Evan R. Murphy

AI Alignment Researcher @ Independent/Non-profit
596 karma · Joined · Working (6-15 years) · Vancouver, BC, Canada


Formerly a software engineer at Google, now I'm doing independent AI alignment research.

Because of my focus on AI alignment, I tend to post more on LessWrong and AI Alignment Forum than I do here.

I'm always happy to connect with other researchers or people interested in AI alignment and effective altruism. Feel free to send me a private message!


Open Phil claims that campaigns to make more Americans go vegan and vegetarian haven't been very successful. But does this analysis account for immigration?

If people who already live in the US are shifting their diets, but new immigrants skew omnivore, a simple analysis could easily miss the former shift because immigration is fairly large in the US.
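To make the arithmetic behind this question concrete, here is a minimal sketch. All the numbers are made up for illustration (they are not Gallup or census figures), and the function name is my own. It just computes a population-weighted vegetarian share and shows how a rise among existing residents can be hidden in the aggregate by a deliberately extreme assumption about new arrivals.

```python
def aggregate_share(resident_pop, resident_share, immigrant_pop, immigrant_share):
    """Population-weighted vegetarian share across residents and new immigrants."""
    total = resident_pop + immigrant_pop
    vegetarians = resident_pop * resident_share + immigrant_pop * immigrant_share
    return vegetarians / total


# Hypothetical inputs, for illustration only (populations in arbitrary units).
# Before: 100 residents, 5% vegetarian, no new immigrants counted yet.
before = aggregate_share(100, 0.05, 0, 0.0)

# After: resident share rises to 5.5%, while 10 new immigrants arrive,
# none of whom are vegetarian (an intentionally extreme assumption).
after = aggregate_share(100, 0.055, 10, 0.0)

print(f"before: {before:.3f}, after: {after:.3f}")
```

Under these assumed inputs the aggregate share is about 5% both before and after, even though the resident share rose by half a percentage point. Whether anything like this happens in the real data depends on actual immigration volumes and diets, which this sketch does not try to estimate.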

Source of Open Phil claim:

But these advocates haven’t achieved the widespread dietary changes they’ve sought, and that boosters sometimes claim they have. Despite the claims, 6% of Americans aren’t vegan and vegetarianism hasn’t risen fivefold lately: Gallup polls show a constant 5-6% of Americans have identified as vegetarians since 1999 (Gallup found 2% identified as vegans the only time it asked, in 2012). The one credible poll showing vegetarianism doubling in recent years still found only 5-7% of Americans identifying as vegetarian in 2017, consistent with the stable Gallup numbers.

Will the AI alignment Slack continue to run?

Thanks JJ and everyone who has worked on AISS for all your great work!

Peter Singer and Tse Yip Fai were doing some work on animal welfare relating to AI last year. It looks like Fai at least is still working in this area. But I'm not sure whether they have considered or initiated outreach to AGI labs, which seems like a great idea.

I place significant weight on the possibility that when labs are in the process of training AGI or near-AGI systems, they will be able to see alignment opportunities that we can't from a more theoretical or distanced POV. In this sense, I'm sympathetic to Anthropic's empirical approach to safety. I also think there are a lot of really smart and creative people working at these labs.

Leading labs also employ some people focused on the worst risks. For misalignment risks, I am most worried about deceptive alignment, and Anthropic recently hired one of the people who coined that term. (From this angle, I would feel safer about these risks if Anthropic were in the lead rather than OpenAI. I know less about OpenAI's current alignment team.)

Let me be clear though: Even if I'm right above and the massively catastrophic misalignment risk from one of these labs creating AGI is ~20%, I consider that very much an unacceptably high risk. I think even a 1% chance of extinction is unacceptably high. If some other kind of project had a 1% chance of causing human extinction, I don't think the public would stand for it. Imagine some particle accelerator or biotech project had a 1% chance of causing human extinction. If the public found out, I think they would want the project shut down immediately until it could be pursued safely. And I think they would be justified in that, if there's a way to coordinate on doing so.

A key part of my model right now relies on who develops the first AGI and on how many AGIs are developed.

If the first AGI is developed by OpenAI, Google DeepMind or Anthropic - all of whom seem relatively cautious (perhaps some more than others) - I put the chance of massively catastrophic misalignment at <20%.

If one of those labs is first and somehow able to prevent other actors from creating AGI after this, then that leaves my overall massively catastrophic misalignment risk at <20%. However, while I think it's likely one of these labs would be first, I'm highly uncertain about whether they would achieve the pivotal outcome of preventing subsequent AGIs.

So, if some less cautious actor overtakes the leading labs, or if the leading lab that first develops AGI cannot prevent many others from building AGI afterward, I think there's a much higher likelihood of massively catastrophic misalignment from one of these attempts to build AGI. In this scenario, my overall massively catastrophic misalignment risk is definitely >50%, and perhaps closer to the 75-90% range.
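The way these two scenarios combine into a single number can be sketched as a simple weighted average. This is only an illustration: the function name is my own, the point values below are placeholders picked from inside the ranges I gave, and the probability of the "cautious lab first and no subsequent AGIs" scenario is a pure assumption, since I'm highly uncertain about it.

```python
def overall_risk(p_contained, risk_contained, risk_uncontained):
    """Scenario-weighted risk (law of total probability over two scenarios).

    p_contained: assumed probability that a cautious lab is first AND
        prevents other actors from building AGI afterward.
    risk_contained: catastrophic misalignment risk in that scenario.
    risk_uncontained: catastrophic misalignment risk otherwise.
    """
    return p_contained * risk_contained + (1 - p_contained) * risk_uncontained


# Placeholder values, for illustration only:
# 0.20 is the upper end of "<20%", 0.75 is the low end of the "75-90%" range,
# and 0.5 for p_contained is an arbitrary assumption.
example = overall_risk(p_contained=0.5, risk_contained=0.20, risk_uncontained=0.75)

print(f"overall risk under these assumptions: {example:.3f}")
```

With these placeholder inputs the result is 0.475. The output moves a lot with `p_contained`, which is exactly the quantity I'm least sure about.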

You're right - I wasn't very happy with my word choice calling Google the 'engine of competition' in this situation. The engine was already in place and involves the various actors working on AGI and the incentives to do so. But these recent developments with Google doubling down on AI to protect their search/ad revenue are revving up that engine.

It's somewhat surprising to me the way this is shaking out. I would expect DeepMind and OpenAI's AGI research to be competing with one another*. But here it looks like Google is the engine of competition, motivated less by any future-focused ideas about AGI and more by the fact that their core search/ad business model appears to be threatened by OpenAI's AGI research.

*And hopefully cooperating with one another too.

I think it's not quite right that low trust is costlier than high trust. Low trust is costly when things are going well. There's kind of a slow burn of additional cost.

But high trust is very costly when bad actors, corruption or mistakes arise that a low trust community would have preempted. So the cost is lumpier, cheap in the good times and expensive in the bad.

(I read fairly quickly so may have missed where you clarified this.)

If anyone consults a lawyer about this or starts the process, it could be very useful to many of us if you followed up here and shared what your experience of the process was like.
