My favorite AI governance research this year so far

Zach Stein-Perlman

Comments 3

Really useful, thanks so much for sharing! "Towards best practices in AGI safety and governance: A survey of expert opinion" was also my favorite AI Governance research this year so far. Happy to see it featured here :)

Also just want to highlight the only course on AI Governance that seems to exist right now: https://course.aisafetyfundamentals.com/governance

Zach Stein-Perlman

My favorite AI governance research since this post (putting less thought into this list):

Responsible Scaling Policies (METR 2023)
Deployment corrections (IAPS: O'Brien et al. 2023)
Open-Sourcing Highly Capable Foundation Models (GovAI: Seger et al. 2023)
Do companies’ AI Safety Policies meet government best practice? (CFI: Ó hÉigeartaigh et al. 2023)
AI capabilities can be significantly improved without expensive retraining (Davidson et al. 2023)

I mostly haven't really read recent research on compute governance (e.g. 1, 2) or international governance (e.g. 1, 2, 3). Probably some of that would be on this list if I did.

I'm looking forward to the final version of the RAND report on securing model weights.

Feel free to mention your favorite recent AI governance research here.

JP Addison🔸

This is a great list — I'm curating.

This space is changing fast, and curation and distillation seem like important work. Thanks for doing it!

Pieces 1, 2, 3, and 4 are aimed directly at extremely important questions; 6 and 7 are aimed directly at very important questions.

For pieces 1, 2, 3, 4, and 6 I would have been very enthusiastic about the proposal. For 5 and 7 I would have been cautiously excited, or excited if the project were executed by someone who's a good fit. Note that my favorite research mostly being research I expect to like is presumably partly due to selection bias in what I read. It is also partly because I haven't deeply engaged with 6 or the technical component of 3, and have only engaged with some parts of 7, so calling them favorites partly reflects that they sound good before I know all of the details.

My favorite AI governance research this year so far

1. Model evaluation for extreme risks (DeepMind, Shevlane et al., May)

2. Towards best practices in AGI safety and governance: A survey of expert opinion (GovAI, Schuett et al., May)

3. What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring (Shavit, March)

4. Survey on intermediate goals in AI governance (Rethink Priorities, Räuker and Aird, March)

5. Literature Review of Transformative AI Governance (LPP, Maas, forthcoming) [edit: published, Nov 2023]

6. “AI Risk Discussions” website: Exploring interviews from 97 AI Researchers (Gates et al., February)

7. What a compute-centric framework says about AI takeoff speeds - draft report (OpenPhil, Davidson, January)