Introduction
In March 2023, we launched the Open Philanthropy AI Worldviews Contest. The goal of the contest was to surface novel considerations that could affect our views on the timeline to transformative AI and the level of catastrophic risk that transformative AI systems could pose. We received 135 submissions. Today we are excited to share the winners of the contest.
But first: We continue to be interested in challenges to the worldview that informs our AI-related grantmaking. To that end, we are awarding a separate $75,000 prize to the Forecasting Research Institute (FRI) for their recently published writeup of the 2022 Existential Risk Persuasion Tournament (XPT).[1] This award falls outside the confines of the AI Worldviews Contest, but the recognition is motivated by the same principles that motivated the contest. We believe that the results from the XPT constitute the best recent challenge to our AI worldview.
FRI Prize ($75k)
Existential Risk Persuasion Tournament by the Forecasting Research Institute
AI Worldviews Contest Winners
First Prizes ($50k)
- AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years by Basil Halperin, Zachary Mazlish, and Trevor Chow
- Evolution provides no evidence for the sharp left turn by Quintin Pope (see the LessWrong version to view comments)
Second Prizes ($37.5k)
- Deceptive Alignment is <1% Likely by Default by David Wheaton (see the LessWrong version to view comments)
- AGI Catastrophe and Takeover: Some Reference Class-Based Priors by Zach Freitas-Groff
Third Prizes ($25k)
- Imitation Learning is Probably Existentially Safe by Michael Cohen[2]
- ‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting by Alex Bates
Caveats on the Winning Entries
The judges do not endorse every argument and conclusion in the winning entries. Most of the winning entries argue for multiple claims, and in many instances the judges found some of the arguments much more compelling than others. In some cases, the judges liked that an entry crisply argued for a conclusion the judges did not agree with—the clear articulation of an argument makes it easier for others to engage. One does not need to find a piece wholly persuasive to believe that it usefully contributes to the collective debate about AI timelines or the threat that advanced AI systems might pose.
Submissions were many and varied. We can easily imagine a different panel of judges reasonably selecting a different set of winners. There are many different types of research that are valuable, and the winning entries should not be interpreted to represent Open Philanthropy’s settled institutional tastes on what research directions are most promising (i.e., we don’t want other researchers to overanchor on these pieces as the best topics to explore further).
[1] We did not provide any funding specifically for the XPT, which ran from June 2022 through October 2022. In December 2022, we recommended two grants totaling $6.3M over three years to support FRI's future research.
[2] The link above goes to the version Michael submitted; he's also written an updated version with coauthor Marcus Hutter.
Comments
Congratulations to the winners.
My question is, now that we have winners, how do we make the best use of this opportunity? What further actions would help people think better about such questions, whether at OP or elsewhere?
This is complicated by the caveats. We don't know which of the points made the judges found interesting or useful, which merely crystallized disagreements, and which were mostly rejected.
As is, my expectation is that the authors (of both the winning and other entries) put a lot of effort into this, the judges put a lot of effort into evaluations, and yet almost no one will know what to do with all of that, so the opportunity will be wasted by default.
So if there's opportunity here, what is it? For me, or for others?
As another commenter notes, at the time I offered a rebuttal to the interest rates post, which I would still stand by almost verbatim, and I'm confused about why this post was still judged so highly, or about what we should be paying attention to there beyond the one line 'the market's interest rates do not reflect the possibility of transformative AI.'
I will refrain from commenting on the others since I haven't given them a proper reading yet (or if I did, I don't remember it).
As a potential experiment/aid here, I created Manifold markets (1,2,3,4,5,6) on whether I would retrospectively consider my review of each of these six to have been a good use of time.
I found your argument more interesting than the other "rebuttals." Halperin et al.'s core argument is that there's a disjunction between the EMH applying to AI and soonish TAI, and they suggest this as evidence against soonish TAI.
The other rebuttals gave evidence for one fork of this disjunction (that the EMH does not apply to AI), but your argument, if correct, suggests that the disjunction might not be there in the first place.