In a recent article, Eliezer Yudkowsky advocates for the following measures to stop AI development:
Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for anyone, including governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
In this passage, Eliezer explicitly calls for airstrikes on "rogue datacentres", and for a willingness to go to war with countries that build GPU clusters. The last sentences are vaguer: he does not specify which actions would both "run some risk of nuclear exchange" and "reduce the risk of large AI training runs". In the context of the preceding passage, the only interpretation that makes sense to me is "threaten to bomb datacentres in nuclear-armed countries". If Eliezer meant something else, I am open to being corrected and will edit this post. Regardless, I still think the idea is worth critiquing, because some people have agreed with it, and I think it is a really, really, really bad idea.
Before I make my argument, I should say that the badness of the idea depends greatly on your beliefs about the inevitability of AI doom. I think it's plausible that this proposal makes sense for Yudkowsky, given his belief that p(doom) is near 100%, and I'm not condemning him for making the argument from that starting point. However, the majority of people here think the odds of AI apocalypse are much lower than 100%, so I will make the case from that perspective.
1. Nuclear blackmail doesn't work
Suppose Xi Jinping declares tomorrow that he believes AI is an imminent existential threat. He then issues an ultimatum to the US: dismantle OpenAI and ban AI research within 6 months, or China will launch airstrikes on Silicon Valley datacentres.
I would estimate the chance of the US complying with this ultimatum to be ridiculously low (<1%). The reasoning would go something like this:
There is a high chance that China is bluffing. If the US gives in to this ultimatum, there is no reason China can't make another ultimatum, and then another, and then another, with the US effectively ceding its sovereignty to China. And not just China: other nations would see that the tactic worked and join in the fun. By increasing the number of blackmail-by-bombing incidents, giving in might actually raise the chance of warfare more than holding out would.
Well, actually, the response would probably involve more flag-waving, swearing, and other displays of nationalism and patriotism. The official line would be something like "bugger off, we do what we want; if you bomb us, we will bomb you back".
In fact, the ultimatum would more likely accelerate AI research, as the US raced to see whether AI could be used to defend against nukes (with safety almost certainly sacrificed along the way). The US would also probably start hiding its datacentres somewhere safe from airstrikes.
Now, what happens when the ultimatum runs out? Either China backs down, in which case the ultimatum was worthless and actively counterproductive, or China bombs Silicon Valley, potentially starting nuclear Armageddon.
In this scenario, the risk of nuclear war has been raised significantly, but the risk of AI extinction has not been reduced at all. In fact, it has arguably increased, by making everyone more desperate.
Now, the actual proposal is not a single nation announcing a threat out of the blue. It involves multiple nations signing treaties together and precommitting to attacks as a last resort. But what if a nuclear-armed country doesn't sign on to the treaty and doesn't respond to other avenues of negotiation? In effect, you're back to the scenario above, which, as I've explained, doesn't work at all.
2. If the world takes AI risk seriously, do we need threats?
A world in which most countries take AI risk seriously enough to risk nuclear war over it is a very different world to the one we live in today. With that level of concern, the opportunities for alternative plans are massive. We could pour billions or even trillions into alignment research and have orders of magnitude more people working on the problem, including the best of the best in every field. If the world does go down the "ban clusters" route, then there are plenty of nonviolent options for dealing with rogue nations, given the massive resources available: we could impose sanctions, try to cut their internet access, cut their semiconductor chip supply, and so on. I'm not advocating for these measures myself, but I am pointing out that they are far preferable to risking nuclear war, and that they are available options in such a world.
Your estimate of x-risk may be high in today's world. But the chance of AI x-risk is conditional on what humanity actually does, and in the world described above, I think most people's estimates would be substantially lower than they are now. If, say, your p(doom) falls from 30% to 5% once the world is coordinating this seriously, the expected benefit of accepting extra nuclear risk shrinks accordingly. This makes the math supporting nuclear brinksmanship even worse.
3. Don't do morally wrong things
In the wake of the FTX fraud scandal, EA has repeatedly tried to emphasize that you shouldn't do ethically dubious things in pursuit of EA goals. I think the proposals above would involve doing ethically dubious things. For example, suppose a small country like Uganda tried to build an AI for medical discovery, to cure a disease plaguing the country. Under this proposal, outsiders with whom it has signed no treaty would order it to stop the research under threat of bombing, and then actually bomb it if it didn't comply. It's true that we could try to compensate them in other ways, but I still find this morally wrong.
4. Nuclear exchanges could be part of a rogue AI's plan
If we are in a world that already has a scheming AI that wants to kill us all, then "start a nuclear war" seems like a fairly obvious move for it to make, assuming it has planned ahead. This is especially the case if the world's governments are all cooperating to suppress its available resources. As long as the AI can survive the exchange somewhere, blowing everything up gives it a pretty good chance of finishing the job.
With this in mind, if we are concerned about AI killing us all, we should also be making nuclear exchanges harder to trigger. Putting the line for nuclear brinksmanship at "a country builds GPU clusters" makes the AI's job incredibly easy. It doesn't even have to build any clusters in a nuclear-armed nation. All it has to do is launch an Iraq-War-WMD-style disinformation campaign that convinces the world the clusters exist, then watch as we militarily demand that Russia dismantle its non-existent hidden GPU clusters.
Conclusion
I hope the arguments above are enough to convince people that a policy of threatening to bomb clusters in nuclear-armed nations is a bad idea that should not be pursued. It's possible that Eliezer was not even arguing for this, but since the idea has now been floated by implication, I think it's important to refute it.
I think you are somewhat missing the point. The point of a treaty with an enforcement mechanism which includes bombing data centers is not to engage in implicit nuclear blackmail, which would indeed be dumb (from a game theory perspective). It is to actually stop AI training runs. You are not issuing a "threat" which you will escalate into greater and greater forms of blackmail if the first one is acceded to; the point is not to extract resources in non-cooperative ways. It is to ensure that the state of the world is one where there is no data center capable of performing AI training runs of a certain size.
The question of whether this would be correctly understood by the relevant actors is important but separate. I agree that in the world we currently live in, it doesn't seem likely. But if you in fact lived in a world that had successfully passed a multilateral treaty like this, it seems much more plausible that people in the relevant positions would have updated far enough to understand that whatever was happening was at least not the typical realpolitik.
Obviously, if you live in a world where you've passed such a treaty, the first step in response to a potential violation is not going to be "bombs away!", and nothing Eliezer wrote suggests otherwise. But the leverage behind those options ultimately bottoms out in the fact that your BATNA (best alternative to a negotiated agreement) is still to bomb the data center.
I think conducting cutting-edge AI capabilities research is pretty immoral, and in this counterfactual world that is a much more normalized position, even if the consensus is that the chance of x-risk, absent a very strong plan for alignment, is something like 10%. You can construct the least convenient possible world, in which some poor country has decided, for perfectly innocent reasons, to build data centers that will predictably get bombed; but unless you think the probability mass on something like that happening is noticeable, it should not be a meaningful factor in your reasoning. Like, we do not let people involuntarily subject others to Russian roulette (which is roughly the epistemic situation of a world where 10% x-risk is the consensus position), and our response to someone actively preparing to play, while declaring their intention to do so in order to get some unrelated real benefit out of it, would be to stop them.
I mean, no, in this world you're already dead. Also, a nuclear exchange would in fact cost the AI quite a lot, so I expect many fewer nuclear wars in worlds where we've accidentally created an unaligned ASI.