Joining the Carnegie Endowment for International Peace

Holden Karnofsky

[anonymous]

186

I'm grateful that Cari and I met Holden when we did (and grateful to Daniela for luring him to San Francisco for that first meeting). The last fourteen years of our giving would have looked very different without his work, and I don't think we'd have had nearly the same level of impact — particularly in areas like farm animal welfare and AI that other advisors likely wouldn't have mentioned.

Adam_Scholl

2y

92

> I also think Open Philanthropy would benefit from less ambiguity about my role in its funding decisions (especially given the fact that I’m married to the President of a major AI company).

This makes sense, but if anything the conflict of interest seems more alarming if you're influencing national policy. For example, I would guess that you are one of the people (maybe literally among the top 10?) who stand to personally lose the most money in the event of an AI pause. Are you worried about this, or taking any actions to mitigate it (e.g., trying to convert equity into cash)?

Holden Karnofsky

2y

51

My spouse isn't currently planning to divest the full amount of her equity. Some factors here: (a) It's her decision, not mine. (b) The equity has important voting rights, such that divesting or donating it in full could have governance implications. (c) It doesn't seem like this would have a significant marginal effect on my real or perceived conflict of interest: I still could not claim impartiality when married to the President of a company, equity or no. With these points in mind, full divestment or donation could happen in the future, but there's no immediate plan to do it.

The bottom line is that I have a significant conflict of interest that isn't going away, and I am trying to help reduce AI risk despite that. My new role will not have authority over grants or other significant resources besides my time and my ability to do analysis and make arguments. People encountering any analysis and arguments will have to decide how to weigh my conflict of interest for themselves, while considering arguments and analysis on the merits.

For whatever it's worth, I have publicly said that the world would pause AI development if it were all up to me, and I make persistent efforts to ensure people I'm interacting with know this. I also believe the things I advocate for would almost universally have a negative expected effect (if any effect) on the value of the equity I'm exposed to. But I don't expect everyone to agree with this or to be reassured by it.

aysja

2y

25

For context, Holden is married to Daniela Amodei, president and co-founder of Anthropic. She also used to work at OpenAI and still, I believe, holds equity there. As Holden has stated elsewhere: "I am married to the President of Anthropic and have a financial interest in both Anthropic and OpenAI via my spouse."

Holden Karnofsky

2y

42

> Besides RSPs, can you give any additional examples of approaches that you're excited about from the perspective of building a bigger tent & appealing beyond AI risk communities? This balancing act of "find ideas that resonate with broader audiences" and "find ideas that actually reduce risk and don't merely serve as applause lights or safety washing" seems quite important. I'd be interested in hearing if you have any concrete ideas that you think strike a good balance of this, as well as any high-level advice for how to navigate this.

I'm pretty focused on red lines, and I don't think I necessarily have big insights on other ways to build a bigger tent, but one thing I have been pretty enthused about for a while is putting more effort into investigating potentially concerning AI incidents in the wild. Based on case studies, I believe that exposing and helping the public understand any concerning incidents could easily be the most effective way to galvanize more interest in safety standards, including regulation. I'm not sure how many concerning incidents there are to be found in the wild today, but I suspect there are some, and I expect there to be more over time as AI capabilities advance.

> Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments (you can't do X or you can't do X unless Y), preparedness from governments (you can keep doing X but if we see Y then we're going to do Z), or other governance mechanisms?

The work as I describe it above is not specifically focused on companies. My focus is on hammering out (a) what AI capabilities might increase the risk of a global catastrophe; (b) how we can try to catch early warning signs of these capabilities (and what challenges this involves); and (c) what protective measures (for example, strong information security and alignment guarantees) are important for safely handling such capabilities. I hope that by doing analysis on these topics, I can create useful resources for companies, governments and other parties.

I suspect that companies are likely to move faster and more iteratively on things like this than governments at this stage, and so I often pay special attention to them. But I’ve made clear that I don’t think voluntary commitments alone are sufficient, and that I think regulation will be necessary to contain AI risks. (Quote from earlier piece: "And to be explicit: I think regulation will be necessary to contain AI risks (RSPs alone are not enough), and should almost certainly end up stricter than what companies impose on themselves.")

Evan R. Murphy

1y

1

> one thing I have been pretty enthused about for a while is putting more effort into investigating potentially concerning AI incidents in the wild. Based on case studies, I believe that exposing and helping the public understand any concerning incidents could easily be the most effective way to galvanize more interest in safety standards, including regulation. I'm not sure how many concerning incidents there are to be found in the wild today, but I suspect there are some, and I expect there to be more over time as AI capabilities advance.

Interesting idea - I can see how exposing AI incidents could be important. This brought to mind the paper "Malla: Demystifying Real-world Large Language Model Integrated Malicious Services." (No affiliation with the paper; it's just one that I remember reading and that we referenced in some Berkeley CLTC AI Security Initiative research earlier this year.) The researchers on the Malla paper dug into the dark web and uncovered hundreds of malicious services based on LLMs being distributed in the wild.

Ryan Greenblatt

2y

20

> Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments

This is discussed in Holden's earlier post on the topic here.

Greg_Colbourn ⏸️

2y

10

Congrats Holden! Just going to quote you from a recent post:

> There’s a serious (>10%) risk that we’ll see transformative AI within a few years.
>
> In that case it’s not realistic to have sufficient protective measures for the risks in time.
>
> Sufficient protective measures would require huge advances on a number of fronts, including information security that could take years to build up and alignment science breakthroughs that we can’t put a timeline on given the nascent state of the field, so even decades might or might not be enough time to prepare, even given a lot of effort.
>
> If it were all up to me, the world would pause now.

Please don't lose sight of this in your new role. Public opinion is on your side here, and PauseAI are gaining momentum. It's possible for this to happen. Please push for it in your new role! (And reduce your conflict of interest if possible!)

Siebe

2y

8