AI safety governance/strategy research & field-building.

Formerly a PhD student in clinical psychology @ UPenn, college student at Harvard, and summer research fellow at the Happier Lives Institute.


The field is not ready, and it's not going to suddenly become ready tomorrow. We need urgent and decisive action, but to indefinitely globally halt progress toward this technology that threatens our lives and our children's lives, not to accelerate ourselves straight off a cliff.

I think most advocacy around international coordination (that I've seen, at least) has this sort of vibe to it. The claim is "unless we can make this work, everyone will die."

I think this is an important point to be raising– and in particular I think that efforts to raise awareness about misalignment + loss of control failure modes would be very useful. Many policymakers have only or primarily heard about misuse risks and CBRN threats, and the "policymaker prior" is usually to think "if there is a dangerous tech, the most important thing to do is to make sure the US gets it first."

But in addition to this, I'd like to see more "international coordination advocates" come up with concrete proposals for what international coordination would actually look like. If the USG "wakes up", I think we will very quickly see that a lot of policymakers + natsec folks will be willing to entertain ambitious proposals.

By default, I expect a lot of people will agree that international coordination in principle would be safer but they will fear that in practice it is not going to work. As a rough analogy, I don't think most serious natsec people were like "yes, of course the thing we should do is enter into an arms race with the Soviet Union. This is the safest thing for humanity."

Rather, I think it was much more a vibe of "it would be ideal if we could all avoid an arms race, but there's no way we can trust the Soviets to follow through on this." (In addition to stuff that's more vibesy and less rational than this, but I do think insofar as logic and explicit reasoning were influential, this was likely one of the core cruxes.)

In my opinion, one of the most important products for "international coordination advocates" to produce is some sort of concrete plan for The International Project. And importantly, it would need to somehow find institutional designs and governance mechanisms that would appeal to both the US and China. Answering questions like "how do the international institutions work", "who runs them", "how are they financed", and "what happens if the US and China disagree" will be essential here.

The Baruch Plan and the Acheson-Lilienthal Report (see full report here) might be useful sources of inspiration.

P.S. I might personally spend some time on this and find others who might be interested. Feel free to reach out if you're interested and feel like you have the skillset for this kind of thing.

Potentially Pavel Izmailov– not sure if he is related to the EA community and not sure of the exact details of why he was fired.


Thanks! Familiar with the post— another way of framing my question is “has Holden changed his mind about anything in the last several months? Now that we’ve had more time to see how governments and labs are responding, what are his updated views/priorities?”

(The post, while helpful, is 6 months old, and I feel like the last several months have given us a lot more info about the world than we had back when RSPs were initially being formed/released.)

Congratulations on the new role– I agree that engaging with people outside of existing AI risk networks has a lot of potential for impact.

Besides RSPs, can you give any additional examples of approaches that you're excited about from the perspective of building a bigger tent & appealing beyond AI risk communities? This balancing act of "find ideas that resonate with broader audiences" and "find ideas that actually reduce risk and don't merely serve as applause lights or safety washing" seems quite important. I'd be interested in hearing if you have any concrete ideas that you think strike a good balance of this, as well as any high-level advice for how to navigate this.

Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments (you can't do X or you can't do X unless Y), preparedness from governments (you can keep doing X but if we see Y then we're going to do Z), or other governance mechanisms? 

(I'll note I ask these partially as someone who has been pretty disappointed in the ultimate output from RSPs, though there's no need to rehash that debate here– I am quite curious about how you're reasoning through these questions despite some likely differences in how we think about the success of previous efforts like RSPs.)


Congrats to Zach! I take this to be mostly a "quick update/celebratory post", but I feel like there's a missing mood that I want to convey in this comment. Note that my thoughts mostly come from an AI Safety perspective, so they may be less relevant for folks who focus on other cause areas.

My impression is that EA is currently facing an unprecedented amount of PR backlash, as well as some solid internal criticisms among core EAs who are now distancing themselves from EA. I suspect this will likely continue into 2024. Some examples:

  • EA has acquired several external enemies as a result of the OpenAI coup. I suspect that investors/accelerationists will be looking for ways to (further) damage EA's reputation.
  • EA is acquiring external enemies as a result of its political engagements. There have been a few news articles recently criticizing EA-affiliated or EA-influenced fellowship programs and think-tanks.
  • EA is acquiring an increasing number of internal critics. Informally, I feel like many people I know (myself included) have become increasingly dissatisfied with the "modern EA movement" and "mainstream EA institutions". Examples of common criticisms include "low integrity/low openness", "low willingness to critique powerful EA institutions", "low willingness to take actions in the world that advocate directly/openly for beliefs", "coziness with AI labs", "general slowness/inaction bias", and "lack of willingness to support groups pushing for concrete policies to curb the AI race." (I'll acknowledge that some of these are more controversial than others and could reflect genuine worldview differences, though even so, my impression is that they're meaningfully contributing to a schism in ways that go beyond typical worldview differences).

I'd be curious to know how CEA is reacting to this. The answer might be "well, we don't really focus much on AI safety, so we don't really see this as our thing to respond to." The answer might be "we think these criticisms are unfair/low-quality, so we're going to ignore them." Or the answer might be "we take X criticism super seriously and are planning to do Y about it."

Regardless, I suspect that this is an especially important and challenging time to be the CEO of CEA. I hope Zach (and others at CEA) are able to navigate the increasing public scrutiny & internal scrutiny of EA that I suspect will continue into 2024.

Do you know anything about the strategic vision that Zach has for CEA? Or is this just meant to be a positive endorsement of Zach's character/judgment? 

(Both are useful; just want to make sure that the distinction between them is clear). 

I appreciate the comment, though I think there's a lack of specificity that makes it hard to figure out where we agree/disagree (or more generally what you believe).

If you want to engage further, here are some things I'd be excited to hear from you:

  • What are a few specific comms/advocacy opportunities you're excited about/have funded?
  • What are a few specific comms/advocacy opportunities you view as net negative/have actively decided not to fund?
  • What are a few examples of hypothetical comms/advocacy opportunities you'd be excited about?
  • What do you think about, e.g., Max Tegmark/FLI, Andrea Miotti/Control AI, The Future Society, the Center for AI Policy, Holly Elmore, PauseAI, and other specific individuals or groups that are engaging in AI comms or advocacy?

I think if you (and others at OP) are interested in receiving more critiques or overall feedback on your approach, one thing that would be helpful is writing up your current models/reasoning on comms/advocacy topics.

In the absence of this, people simply notice that OP doesn't seem to be funding some of the main existing examples of comms/advocacy efforts, but they don't really know why, and they don't really know what kinds of comms/advocacy efforts you'd be excited about.

Answer by Akash

I expect that your search for a "unified resource" will be unsatisfying. I think people disagree enough on their threat models/expectations that there is no real "EA perspective".

Some things you could consider doing:

  • Have a dialogue with 1-2 key people you disagree with.
  • Pick one perspective (e.g., Paul's worldview, Eliezer's worldview) and write about where you disagree with it.
  • Write up a "Matthew's worldview" doc that focuses more on explaining what you expect to happen and isn't necessarily meant as a "counterargument" piece.

Among the questions you list, I'm most interested in these:

  • How bad human disempowerment would likely be from a utilitarian perspective
  • Whether there will be a treacherous turn event, during which AIs violently take over the world after previously having been behaviorally aligned with humans
  • How likely AIs are to kill every single human if they are unaligned with humans
  • How society is likely to respond to AI risks, and whether they'll sleepwalk into a catastrophe
