In a recent article, Eliezer Yudkowsky advocates for the following measures to stop AI development:
Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for anyone, including governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
In this passage, Eliezer explicitly calls for airstrikes on "rogue" datacentres, and for a willingness to go to war with countries that build GPU clusters. The last sentences are vaguer: he does not specify which actions would both "run some risk of nuclear exchange" and "reduce the risk of large AI training runs". In the context of the preceding passages, the only interpretation that makes sense to me is "threaten to bomb datacentres in nuclear-armed countries". If Eliezer meant something else, I am open to being corrected and will edit this post. Regardless, I still think the idea is worth critiquing, because some people have agreed with it, and I think it is a really, really, really bad idea.
Before I make my argument, I should say that the badness of the idea depends greatly on your beliefs about the inevitability of AI doom. I think it's plausible that this proposal makes sense for Yudkowsky, given his belief that p(doom) is near 100%, and I'm not condemning him for making the argument from that starting point. However, the majority of people here think the odds of AI apocalypse are much lower than 100%, so I will make the case from that perspective.
1. Nuclear blackmail doesn't work
Suppose Xi Jinping declares tomorrow that he believes AI is an imminent existential threat. He then issues an ultimatum to the US: dismantle OpenAI and ban AI research within 6 months, or China will launch airstrikes on Silicon Valley datacentres.
I would estimate the chance of the US complying with this ultimatum to be ridiculously low (<1%). The reasoning would go something like this:
There is a high chance that China is bluffing. And if the US gives in to this ultimatum, there is no reason China can't make another ultimatum, and then another, and then another, with the US effectively ceding its sovereignty to China. And not just to China: other nations would see that the tactic worked and join in the fun. By increasing the number of bombing-blackmail incidents, giving in might actually raise the chance of warfare more than holding out would.
In practice, of course, the response would probably involve rather more flag-waving, swearing, and other displays of nationalism and patriotism. The official response would be something like "bugger off, we do what we want, and if you bomb us we will bomb you back".
If anything, the ultimatum is more likely to accelerate AI research, to see whether AI could be used to defend against nukes (and safety would almost certainly be sacrificed along the way). The US would also probably start hiding datacentres where they are safe from airstrikes.
Now what happens when the ultimatum runs out? Either China backs down, in which case the ultimatum was worthless and actively counterproductive, or China bombs Silicon Valley, potentially starting nuclear Armageddon.
In this scenario, the risk of nuclear war has been raised significantly, but the risk of AI extinction has not been reduced at all. In fact it has arguably been increased, by making people more desperate.
Now, the actual proposal is not a single nation announcing a threat out of the blue. It would involve multiple nations signing treaties together, precommitting to attacks as a last resort. But what if a nuclear-armed country doesn't sign on to the treaty and doesn't respond to other avenues of negotiation? In effect, you're back to the scenario above, which, as I've explained, doesn't work at all.
2. If the world takes AI risk seriously, do we need threats?
A world in which most countries take AI risk seriously enough to risk nuclear war over it is a very different world from the one we live in today. With that level of concern, the opportunities for alternative plans are massive. We could pour billions or even trillions into alignment research, and have orders of magnitude more people working on the problem, including the best of the best in every field. If the world does go down the "ban clusters" route, there are plenty of nonviolent options for dealing with rogue nations, given the massive resources available: sanctions, cutting off their internet access, cutting off their semiconductor chip supply, and so on. I'm not advocating these measures myself, but I am pointing out that they are far preferable to risking nuclear war, and that they are available options in such a world.
Your estimate of x-risk may be high in today's world. But the chances of AI x-risk are conditional on what humanity actually does, and in the world described above, I think most people's estimates would be substantially lower than they are now. That makes the expected-value math behind nuclear brinksmanship even worse.
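To make that concrete, here is a minimal toy expected-value sketch. Every number in it is a placeholder I invented purely for illustration, not an estimate from this post, from Yudkowsky, or from anyone else; the only point is that the smaller the conditional p(doom) becomes, the harder it is for any extra reduction in x-risk to outweigh the added risk of nuclear war.

```python
# Toy expected-value comparison: "treaty with bombing threats" vs "treaty without".
# All numbers below are invented placeholders, for illustration only.

p_doom = 0.02              # hypothetical x-risk in a world already pouring resources into alignment
doom_reduction = 0.005     # hypothetical extra x-risk reduction bought by threatening airstrikes
p_nuclear_war = 0.05       # hypothetical added chance of nuclear war caused by the brinksmanship
nuclear_war_cost = 0.5     # badness of nuclear war relative to extinction (scale 0 to 1), also invented

loss_without_threats = p_doom
loss_with_threats = (p_doom - doom_reduction) + p_nuclear_war * nuclear_war_cost

print(f"expected loss without threats: {loss_without_threats:.3f}")  # 0.020
print(f"expected loss with threats:    {loss_with_threats:.3f}")     # 0.040
```

With these particular placeholders, the threats roughly double the expected loss. You can of course pick numbers where they come out ahead, but only by assuming either a much larger x-risk reduction from the threats or a much smaller chance that the brinksmanship itself goes wrong.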
3. Don't do morally wrong things
In the wake of the FTX fraud scandal, EA has repeatedly tried to emphasize that you shouldn't do ethically dubious things in pursuit of EA goals. I think the proposals above involve doing ethically dubious things. For example, if a small country like Uganda tried to build an AI for medical discoveries, to cure a disease plaguing the country, the proposal is that outsiders with whom Uganda has signed no treaty would order it to stop its research under threat of war and bombing, and then actually bomb it if it doesn't comply. It's true that we could try to compensate them in other ways, but I still find this morally wrong.
4. Nuclear exchanges could be part of a rogue AI plan
If we are in a world that already has a scheming AI that wants to kill us all, then "start a nuclear war" seems like a fairly obvious move for it to make, assuming it has planned ahead. This is especially true if the world's governments are all cooperating to suppress its available resources. As long as the AI can survive the exchange somewhere, blowing everything up gives it a pretty good chance of finishing the job.
With this in mind, if we are concerned about AI killing us all, we should also be reducing the ease of nuclear exchanges. Putting the line for nuclear brinksmanship at "a country builds GPU clusters" makes the AI's job incredibly easy. It doesn't even have to build any clusters in a nuclear-armed nation. All it has to do is launch an Iraq-war-WMD-style disinformation campaign that convinces the world the clusters exist, then watch as we militarily demand that Russia dismantle its non-existent hidden GPU clusters.
Conclusion
I hope the arguments above are enough to convince people that a policy of threatening to bomb clusters in nuclear-armed nations is a bad idea that should not be pursued. It's possible that Eliezer was not even arguing for this, but since the idea has now been floated by implication, I think it's important to refute it.
With all due respect, I think people are reading way too far into this; Eliezer was just talking about the enforcement mechanism for a treaty. Yes, treaties are sometimes (often? always?) backed up by force. Stating this explicitly seems dumb because it leads to posts like this one, but let's not make it bigger than it is.
It varies, but most treaties are not backed up by force (by which I assume we're referring to inter-state armed conflict). They're often backed up by the possibility of mutual tit-for-tat defection or economic sanction, among other possibilities.
2. If the world takes AI risk seriously, do we need threats?
The world takes murder seriously, and part of taking murder seriously is having enforcement mechanisms for its prohibition. If you break the law, the police will come to arrest you, and if you resist them they will shoot you. The (veiled) threat of violence is the foundation of basically all law; Eliezer is just unusually honest about this. If you're a hardcore anarchist who opposes the existence of state coercion in general then, fair enough, you can object to this, but for everyone else... this is how states deal with serious negative externalities: by imposing restrictions on the activities that cause them, backed up by the threat of irresistible violence that is the foundation of the Leviathan.
This is a fully general argument for state violence in response to any threat. The author's arguments are specific to the case at hand and don't generalise to a case for absolute pacifism on the part of states.
I think you are somewhat missing the point. The point of a treaty with an enforcement mechanism which includes bombing data centers is not to engage in implicit nuclear blackmail, which would indeed be dumb (from a game theory perspective). It is to actually stop AI training runs. You are not issuing a "threat" which you will escalate into greater and greater forms of blackmail if the first one is acceded to; the point is not to extract resources in non-cooperative ways. It is to ensure that the state of the world is one where there is no data center capable of performing AI training runs of a certain size.
The question of whether this would be correctly understood by the relevant actors is important but separate. I agree that in the world we currently live in, it doesn't seem likely. But if you in fact lived in a world that had successfully passed a multilateral treaty like this, it seems much more plausible that people in the relevant positions would have updated far enough to understand that whatever was happening was at least not typical realpolitik.
Obviously, if you live in a world where you've passed such a treaty, the first step in response to a potential violation is not going to be "bombs away!", and nothing Eliezer wrote suggests otherwise. But having those options available ultimately bottoms out in the fact that your BATNA is still to bomb the data center.
I think conducting cutting-edge AI capabilities research is pretty immoral, and in this counterfactual world that is a much more normalized position, even if the consensus is that the chance of x-risk, absent a very strong plan for alignment, is something like 10%. You can construct the least convenient possible world, in which some poor country has decided, for perfectly innocent reasons, to build data centers that will predictably get bombed, but unless you think the probability mass on something like that happening is noticeable, I don't think it should be a meaningful factor in your reasoning. After all, we do not let people involuntarily subject others to Russian roulette (which is roughly the epistemic situation of a world where 10% x-risk is the consensus position), and our response to someone actively preparing to play it, while declaring their intention to do so in order to get some unrelated real benefit, would be to stop them.
I mean, no: in this world you're already dead. Also, a nuclear exchange would in fact cost the AI quite a lot, so I expect many fewer nuclear wars in worlds where we've accidentally created an unaligned ASI.
The relevant comparison here is between two treaties that are identical, except that one includes the policy "bomb datacentres in nuclear-armed nations" and one does not. The only case where they differ is the scenario where a nuclear-armed nation starts building GPU clusters. In that case, policy A demands resorting to nuclear blackmail once all other avenues have been exhausted, while policy B does not.
I think a missing ingredient here is the scenario that led up to this policy. If there had already been a warning shot, where an AI built in a GPT-4-sized cluster killed millions of people, then it is plausible that such a clause might work, because both parties would be putting clusters in the "super-nukes" category.
If that hasn't happened, or the case for clusters being dangerous is seen as flimsy, then we are essentially back at the "China threatens to bomb OpenAI" scenario. I think this is a terrible scenario, unless you actually do think that nuclear war is preferable to large data clusters being built. (To be clear, I think the chance of any individual data cluster causing the apocalypse is minuscule.)
I wasn't imagining that Eliezer was proposing the government immediately threaten nuclear strikes unless datacenters are taken down. I was imagining instead that he was proposing governments make "make sure datacenters are taken down" their top priority, and then follow whatever level of escalation was needed for that to happen: once there is a clear policy, begin by asking nicely, then use diplomatic efforts, then economic efforts like sanctions, then military threats, and then, if needed, follow through on those threats, using both carrots and sticks along the escalatory ladder. Of course, it would be foolish to begin with the highest rung on the ladder, and it would obviously be much preferable to never reach the higher levels.
Hopefully not! No one is saying "we should use military threats, whether they're needed or not", but instead that the government should be willing to escalate to that level if necessary. I don't think it's crazy to imagine a world where most countries take the risk seriously, but a few rogue nations continue on ahead despite the efforts of the rest of the world to stop them.
I don't think it's necessarily immoral for governments to enforce international treaties against WMDs (and similar) with the use of force once other avenues have been exhausted. I stand by this statement even if the enforcement is against a rogue nation that hasn't signed the treaty themselves. Having said that, the specifics matter here – there's obviously a history of governments using the false pretense of WMDs to carry out immoral military actions for other purposes (like geopolitical struggles).
I find it very unlikely that nuclear exchange would counterfactually be part of a rogue AI plan for human extinction. We're imagining an AI (or group of AIs) that can out-scheme all of humanity combined and that can continue to support itself in the absence of humanity, yet the only way it can do its job involves a nuclear exchange, and one based specifically on the assumed existence of GPU clusters?
I applaud you for writing this post.
There is a huge difference between statement (a), "AI is more dangerous than nuclear war", and statement (b), "we should, as a last resort, use nuclear weapons to stop AI". It is irresponsible to downplay the danger and horror of (b) by claiming Yudkowsky is merely displaying intellectual honesty by making explicit what treaty enforcement entails (not least because everyone studying or working on international treaties is already aware of this, and is willing to discuss it openly). Yudkowsky is making a clear and precise declaration of what he is willing to do if necessary. To see this, one only needs to consider the opposite position, statement (c): "we should not start nuclear war over AI under any circumstance". Statement (c) can reasonably be included in an international treaty dealing with this problem without that treaty losing all enforceability; there are plenty of other enforcement mechanisms. Finally, the last thing anyone defending Yudkowsky can claim is that there is a low probability we will need to use nuclear weapons: there is a higher probability of AI research continuing than of AI research leading to human annihilation. Yudkowsky is gambling that by threatening the use of force he will prevent a catastrophe, but there is every reason to believe his threats increase the chances of a similarly devastating catastrophe.