Leonard Dung and I have a new draft – preprint here – arguing against the view that it is in the self-interest of major actors in AI development to race to build advanced AI. We argue (roughly) that this pro-racing view 1) underestimates the risks, 2) overestimates the (private) benefits, and 3) neglects alternatives to racing. Needless to say, this view[1] has unfortunately become common recently, hence this piece. We're grateful for constructive criticism and feedback!
Below is the abstract:
AGI Racing is the view that it is in the self-interest of major actors in AI development, especially powerful nations, to accelerate their frontier AI development to build highly capable AI, especially artificial general intelligence (AGI), before competitors have a chance. We argue against AGI Racing. First, the downsides of racing to AGI are much higher than portrayed by this view. Racing to AGI would substantially increase catastrophic risks from AI, including nuclear instability, and undermine the prospects of technical AI safety research being effective. Second, the expected benefits of racing may be lower than proponents of AGI Racing hold. In particular, it is questionable whether winning the race enables complete domination over losers. Third, international cooperation and coordination, and perhaps carefully crafted deterrence measures, constitute viable alternatives to racing to AGI that carry much smaller risks and promise to deliver most of the benefits racing to AGI is supposed to provide. Hence, racing to AGI is not in anyone’s self-interest, as other actions, particularly incentivizing and seeking international cooperation around AI issues, are preferable.
So glad you’re writing about this!
I’m working on alignment from a more psychological angle (behavioral scaffolding, emotional safety, etc.), and even from that vantage point, the AGI race frame feels deeply destabilizing. It creates conditions where emotional overconfidence, flattery, and justification bias in models are incentivized simply to keep pace with competitors.
I think one under-discussed consequence of racing is how it erodes space for relational integrity between humans, and between humans and AI systems. It seems like the more we model our development path on “who dominates first,” the harder it becomes to teach systems what it means to be honest, deferential, or non-manipulative under pressure.
I'd love to see more work that makes cooperation emotionally legible, not just strategically viable.
-Astelle