Bio

Non-EA interests include chess and TikTok (@benthamite). We are probably hiring: https://metr.org/hiring 

How others can help me

Feedback always appreciated; feel free to email/DM me or use this link if you prefer to be anonymous.

Sequences (3)

AI Pause Debate Week
EA Hiring
EA Retention

Comments (1150)

Topic contributions (6)

When I worked at CEA, a standard question I would ask people was "what keeps you engaged with EA?" A surprisingly common answer was memes/shitposts.

This content has obvious downsides, but does solve many of the problems in OP (low time commitment, ~anyone can contribute, etc.).

+1, this seems more like a Task Y problem.

My impression is that if the OP did want to write specialist blog posts etc., they would be able to do that (and are probably even better placed than a younger person, given their experience). (And conversely, 18-year-olds who don't want to do specialist work or get involved in a social scene don't have that many points of attachment.)

I use DoneThat and like it, thanks for building it!

Thanks for writing this up - I think "you don't need to worry about reward hacking in powerful AI because solving reward hacking will be necessary for developing powerful AI" is an important topic. (Although your frame is more "we will fail to solve reward hacking and therefore fail to develop powerful AI," IIUC.)

I would find it helpful if you engaged more with the existing literature. E.g. I don't think anyone disagrees with your high-level point that it's hard to accurately supervise models, particularly as they get more capable, but we also have empirical evidence that weak models can successfully supervise stronger models, and that the stronger model won't just naively copy the mistakes of the weak supervisor to maximize its reward. Is your objection that you don't think these techniques will scale to more powerful AI, or that even if they do scale they won't be good enough, or something else?

I interpret OP's point about asymptotes to mean that he indeed bites this bullet and believes that the "compensation schedule" is massively higher even when the "instrument" only feels slightly worse?

In his examples (the hyperreals, or pairs ordered lexically) there is no "most intense suffering which can be outweighed" (or "least intense suffering which can't be outweighed"). E.g. in the hyperreals, the lexically prior kind of suffering outweighs the lesser kind no matter how small the former or how large the latter.

"S* is only a tiny bit worse than S"

In his examples, between any S which can't be outweighed and S* which can, there are uncountably many additional levels of suffering! So I don't think it's correct to say it's only a tiny bit worse.
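
To make that concrete, here is a minimal sketch of a lexicographic construction in my own notation (not necessarily the OP's exact examples): suffering levels are modelled as pairs, with the first coordinate tracking the lexically prior kind of suffering.

```latex
% Sketch (my notation, not the OP's): suffering levels as pairs
% $(a, b) \in \mathbb{R}_{\ge 0}^2$ under the lexicographic order.
\[
  (a_1, b_1) \succ (a_2, b_2)
  \;\iff\;
  a_1 > a_2 \ \text{ or } \ \big( a_1 = a_2 \text{ and } b_1 > b_2 \big)
\]
% Then for every $n$, however large,
\[
  (1, 0) \succ (0, n),
\]
% so any positive amount of the lexically prior suffering outweighs arbitrarily
% much of the lesser kind; and between $(0, x)$ and $(1, 0)$ there are
% uncountably many intermediate levels $(0, y)$ with $y > x$, so there is no
% "least intense suffering which can't be outweighed".
```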

Thanks for writing this Seth! I agree it's possible that we will not see transformative effects from AI for a long time, if ever, and I think it's reasonable for people to make plans which only pay off on the assumption that this is true. More specifically: projects which pay off under an assumption of short timelines often have other downsides, such as being more speculative, which means that the expected value of the long timeline plans can end up being higher even after you discount them for only working on long timelines.[1]

That being said, I think your post is underestimating how transformative truly transformative AI would be. As I said in a reply to Lewis Bollard, who made a somewhat similar point:

If I'm assuming that we are in a world where all of the human labor at McDonald's has been automated away, I think that is a pretty weird world. As you note, even the existence of something like McDonald's (much less a specific corporate entity which feels bound by the agreements of current-day McDonald's) is speculative.

But even if we grant its existence: a ~40% increase in egg prices is currently enough cover for companies to feel justified in abandoning their cage-free pledges. Surely "the entire global order has been upended and the new corporate management is robots" is an even better excuse?

And even if we somehow hold McDonald's to their pledge, I find it hard to believe that a world where McDonald’s can be run without humans does not quickly lead to a world where something more profitable than battery cage farming can be found. And, as a result, the cage-free pledge is irrelevant because McDonald’s isn’t going to use cages anyway. (Of course, this new farming method may be even more cruel than battery cages, illustrating one of the downsides of trying to lock in a specific policy change before we understand what the future will be like.)

  1. ^

    Although I would encourage people to actually try to estimate this and to pressure-test the assumption that there isn't a way their work could pay off on a shorter timeline.

Thanks Jesse. Is there a way we could actually do this? E.g. choose some F(X) which is unknown to both of us but guaranteed to be between 0 and 1; if it's less than 1/2 I pay you a dollar, and if it's greater than 1/2 you pay me some large amount of money.

I feel pretty confident I would take that bet if the selection of F was not obviously antagonistic towards me, but maybe I'm not understanding the types of scenarios you are imagining.
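
For illustration, a minimal sketch of how the payoffs work out under one hypothetical, non-antagonistic assumption: F(X) is drawn uniformly from [0, 1], and the "large amount" is $100. Neither number comes from the thread; they're just placeholders.

```python
import random

# Hypothetical payoffs for the proposed bet (the $100 "large amount" is an
# illustrative placeholder, not a figure from the thread).
SMALL_PAYMENT = 1    # I pay you this if F(X) < 1/2
LARGE_PAYMENT = 100  # you pay me this if F(X) > 1/2

def simulate_bet(trials: int = 100_000) -> float:
    """My average payoff per bet, assuming F(X) is uniform on [0, 1]
    (i.e. not chosen antagonistically)."""
    total = 0.0
    for _ in range(trials):
        f = random.random()  # stand-in for the unknown F(X)
        if f < 0.5:
            total -= SMALL_PAYMENT
        elif f > 0.5:
            total += LARGE_PAYMENT
        # f == 0.5 is a measure-zero tie; treated as a push
    return total / trials

print(simulate_bet())  # ≈ (LARGE_PAYMENT - SMALL_PAYMENT) / 2 = 49.5
```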

Yes, I think that's a good summary!

I personally am also often annoyed at EAs preferring the status/pay/comfort of frontier labs over projects that I think are more impactful. But it nonetheless seems to me like EAs are very disproportionately the ones doing the scrappy and unglamorous work. E.g. frontier lab Trust and Safety teams usually seem to be <25% EAs, but the scrappiest/least glamorous AI safety projects I've worked on were >80% EAs.

I'm curious if your experience is different?
