New article in Time Ideas by Eliezer Yudkowsky.
Here are some selected quotes.
In reference to the letter that just came out (discussion here):
We are not going to bridge that gap in six months.
It took more than 60 years between when the notion of Artificial Intelligence was first proposed and studied, and for us to reach today’s capabilities. Solving safety of superhuman intelligence—not perfect safety, safety in the sense of “not killing literally everyone”—could very reasonably take at least half that long. And the thing about trying this with superhuman intelligence is that if you get that wrong on the first try, you do not get to learn from your mistakes, because you are dead. Humanity does not learn from the mistake and dust itself off and try again, as in other challenges we’ve overcome in our history, because we are all gone.
…
Some of my friends have recently reported to me that when people outside the AI industry hear about extinction risk from Artificial General Intelligence for the first time, their reaction is “maybe we should not build AGI, then.”
Hearing this gave me a tiny flash of hope, because it’s a simpler, more sensible, and frankly saner reaction than I’ve been hearing over the last 20 years of trying to get anyone in the industry to take things seriously. Anyone talking that sanely deserves to hear how bad the situation actually is, and not be told that a six-month moratorium is going to fix it.
Here’s what would actually need to be done:
The moratorium on new large training runs needs to be indefinite and worldwide. There can be no exceptions, including for governments or militaries. If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth. If I had infinite freedom to write laws, I might carve out a single exception for AIs being trained solely to solve problems in biology and biotechnology, not trained on text from the internet, and not to the level where they start talking or planning; but if that was remotely complicating the issue I would immediately jettison that proposal and say to just shut it all down.
Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for anyone, including governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
That’s the kind of policy change that would cause my partner and I to hold each other, and say to each other that a miracle happened, and now there’s a chance that maybe Nina will live. The sane people hearing about this for the first time and sensibly saying “maybe we should not” deserve to hear, honestly, what it would take to have that happen. And when your policy ask is that large, the only way it goes through is if policymakers realize that if they conduct business as usual, and do what’s politically easy, that means their own kids are going to die too.
Shut it all down.
We are not ready. We are not on track to be significantly readier in the foreseeable future. If we go ahead on this everyone will die, including children who did not choose this and did not do anything wrong.
Shut it down.
In light of this discussion about whether people would find this article alienating, I sent it to four very smart/reasonable friends who aren't involved in EA, don't work on AI, and don't live in the Bay Area (definitely not representative of TIME readers, but maybe representative of the kind of people EAs want to reach). Given that I don't work on AI and have only ever discussed AI risk with one of them, I don't think social desirability bias played much of a role. I also ran this comment by them after we discussed. Here's a summary of their reactions:
Friend 1: Says it's hard for them to understand why AI would want to kill everyone, but acknowledges that experts know much more about this than they do and takes seriously that experts believe this is a real possibility. Given this, they think it makes sense to err on the side of caution and drastically slow down AI development to get the right safety measures in place.
Friend 2: Says it's intuitive that AI being super powerful, not well understood, and rapidly developing is a dangerous combination. Given this, they think it makes sense to implement safeguards. But they found the article overwrought, especially given missing links in the argument...
This comment was fantastic! Thanks for taking the time to do this.
In a world where the most prominent online discussants tend to be weird in a bunch of ways, we don't hear enough reactions from "normal" people who are in a mindset of "responding thoughtfully to a friend". I should probably be doing more friend-scanning myself.
For me, the value of information seems to exceed the potential damage done at these sample sizes.
Given the typical correlation between upvotes and agreevotes, this is actually much more upvoted than you would expect (holding constant the disagreevotes).
I didn't actually downvote, but I did consider it, because I dislike PR-criticism of people for disclosing true, widely available information in the process of performing a useful service.
It made it to the White House Press Briefing. This clip is like something straight out of the film Don't Look Up. Really hope that the ending is better (i.e. the warning is actually heeded).
When a very prominent member of the community is calling for governments to pre-commit to pre-emptive military strikes against countries allowing the construction of powerful AI in the relatively near term, including against nuclear powers*, it's really time for people to actually take seriously the stuff about rejecting naive utilitarianism where you do crazy-sounding stuff if a quick expected value calculation makes it look like the maximizing choice.
*At least I assume that's what he means by being prepared to risk a higher chance of nuclear war.
Clarification for anyone who's reading this comment outside of having read the article – the article calls for governments to adopt clear policies involving potential preemptive military strikes in certain circumstances (specifically, against a hypothetical "rogue datacenter", as these datacenters could be used to build AGI), but it is not calling for any specific military strike right now.
Agreed! I think the policy proposal is a good one that makes a lot of sense, and I also think this is a good time to remind people that "international treaties with teeth are plausibly necessary here" doesn't mean it's open season on terrible naively consequentialist ideas that sound "similarly extreme". See the Death With Dignity FAQ.
This goes considerably beyond 'international treaties with teeth are plausibly necessary here':
'If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.'
Eliezer is proposing attacks on any countries that are building AI-above-a-certain-level, whether or not they sign up to the treaty. That is not a treaty enforcement mechanism. I also think "with teeth" kind of obscures by abstraction here (since it doesn't necessarily sound like it means war/violence, but that's what's being proposed).
Is this actually inconsistent? If a country doesn't sign up for the Biological Weapons Convention, and then acts in flagrant disregard of it, would they not be expected to face retaliatory action from signatories, including, depending on specifics, plausibly up to military force? My sense was that people who pushed for the introduction and enforcement of the BWC would have imagined such a response as plausibly within bounds.
I don't think what Eliezer is proposing would necessarily mean war/violence either – conditional on a world actually getting to the point where major countries are agreeing to such a treaty, I find it plausible that smaller countries would simply acquiesce in shutting down rogue datacenters. If they didn't, before military force was used, diplomacy would be used. Then probably economic sanctions would be used. Eliezer is saying that governments should be willing to escalate to using military force if necessary, but I don't think it's obvious that in such a world military force would be necessary.
Yep, I +1 this response. I don't think Eliezer is proposing anything unusual (given the belief that AGI is more dangerous than nukes, which is a very common belief in EA, though not universally shared). I think the unusual aspect is mostly just that Eliezer is being frank and honest about what treating AGI development and proliferation like nuclear proliferation looks like in the real world.
He explained his reasons for doing that here:
I'm not sure I have much to add, though I do have concerns about how Eliezer wrote some of this piece, given the predictable pushback it's seen. Maybe breaking the Overton window is a price worth paying? I'm not sure.
In any case, I just wanted to note that we have at least 2 historical examples of nations carrying out airstrikes on bases in other countries without that leading to war, though admittedly the nation attacked was not nuclear:
Both of these cases involved a nation taking action somewhat unilaterally against another, destroying the other nation's capability with an airstrike, and what followed was not war but sabre-rattling and proxy conflict (note: that's my takeaway as a lay non-expert, and I may be wrong about the consequences of these strikes! The consequences of Opera especially seem to be a matter of some historical debate).
I'm sure that there are other historical examples that could be found which shed light on what Eliezer's foreign policy...
Very hard hitting and emotional. I'm feeling increasingly like I did in February/March 2020, pre-lockdown. Full on broke down to tears after reading this. Shut it all down.
Strong agree, hope this gets into the print version (if it hasn't already).
Here’s a comment I shared on my LessWrong shortform.
——
I’m still thinking this through, but I am deeply concerned about Eliezer’s new article for a combination of reasons:
In the end, I expect this will just alienate people. And stuff like this concerns me.
I think it’s possible that the most memetically powerful approach will be to accelerate alignment rather than suggesting long-term bans or effectively antagonizing all AI use.
A couple of things make me inclined to disagree with you about whether this will alienate people, including:
1) The reaction on Twitter seems okay so far
2) Over the past few months, I've noticed a qualitative shift among non-EA friends/family regarding their concerns about AI; people seem worried
3) Some of the signatories of the FLI letter didn't seem to be the usual suspects; I have heard one prominent signatory openly criticize EA, so that feels like a shift, too
4) I think smart, reasonable people who have been exposed to ChatGPT but don't know much about AI—i.e., many TIME readers—intuitively get that "powerful thing we don't really understand + very rapid progress + lack of regulation/coordination/good policy" is a very dangerous mix
I'd actually be eager to hear more EAs talk about how they became concerned about AI safety, because I was persuaded that this was something we should be paying close attention to over the course of one long conversation, and it would take less convincing today. Maybe we should send this article to a few non-EA friends/family members and see what their reaction is?
So, this has blown up way more than I expected, and things are chaotic. Still not sure what will happen or if a treaty is actually in the cards, but I'm beginning to see a path to potentially a lot more investment in alignment. One example why: Jeff Bezos just followed Eliezer on Twitter, and I think it may catch the attention of pretty powerful and rich people who want to see AI go well. We are so off-distribution; this could go in any direction.
Wow, Bezos has indeed just followed Eliezer:
https://twitter.com/BigTechAlert/status/1641659849539833856
Related: "Amazon partners with startup Hugging Face for ChatGPT rival" (Los Angeles Times, Feb 21st 2023)
In case we have very different feeds, here's a set of tweets critical of the article:
- https://twitter.com/mattparlmer/status/1641230149663203330?s=61&t=ryK3X96D_TkGJtvu2rm0uw (lots of quote-tweets on this one)
- https://twitter.com/jachiam0/status/1641271197316055041?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/finbarrtimbers/status/1641266526014803968?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/plinz/status/1641256720864530432?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/perrymetzger/status/1641280544007675904?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/post_alchemist/status/1641274166966996992?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/keerthanpg/status/1641268756071718913?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/levi7hart/status/1641261194903445504?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/luke_metro/status/1641232090036600832?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/gfodor/status/1641236230611562496?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/luke_metro/status/1641263301169680386?s=61&t=ryK3X96D_TkGJtvu2rm0uw
- https://twitter.com/perrymetzger/status/1641259
...

Yeah, I'm definitely not disputing that some people will be alienated by this. My basic reaction is just: AI safety people are already familiar with EY's takes; I suspect people like my parents will read this and be like "whoa, this makes some sense and is kind of scary." (With regard to differing feeds, I just put the link to the article into the Twitter search bar and sorted by latest. I still think the negative responses are a minority.)
I'm not particularly well informed about current EA discourse on AI alignment, but I imagine that two possible strategies are
Yudkowsky's article helps push on the latter approach. Making the public and governments more worried about AI risk does seem to me the most plausible way of slowing it down. If more people in the national-security community worry about AI risks, there could be a lot more attention to these issues, as well as the possibility of policies like limiting total computing power for AI training that only governments could pull off.
I expect a lot of AI developers would be angry about getting the public and governments more alarmed, but if the effort to raise alarm works well enough, then the AI developers will have to comply. OTOH, there's also a possible "boy who cried wolf" situation in which AI progress continues, nothing that bad happens for a few years, and then people assume the doomsayers were overreacting -- making it harder to ring alarm bells the next time.
In any social policy battle (climate change, racial justice, animal rights) there will be people who believe that extreme actions are necessary. It's perhaps unusual on the AI front that one of the highest-profile experts is on that extreme, but it's still not an unusual situation. A couple of points in favour of this message having a net positive effect:
- I don't buy the argument that extreme arguments alienate people from the cause in general. This is a common assumption, but the little evidence we have suggests that extreme actions or talk might actually both increase visibility of the cause and increase support for more moderate groups. Anecdotally, on the AI front @lilly seems to be seeing something similar too.
- On a rational front, if he is this sure of doom, his practical solution seems to make the most sense. It shows intellectual integrity. We can't expect someone to have a pdoom of 99% given the status quo, then just suggest better alignment strategies. From a scout mindset perspective, we need to put ourselves in the 99% doom shoes before dismissing this opinion as irrational, even if we strongly disagree with his pdoom.
- (Related to 1), I feel like AI ri...

People who know that they are outliers amongst experts in how likely they think X is (as I think being 99% sure of doom is, particularly combined with short-ish timelines) should be cautious about taking extreme actions on the basis of an outlying view, even if they think they have performed a personal adjustment to down-weight their confidence to take account of the fact that other experts disagree, and still ended up north of 99%. Otherwise you get the problem that extreme actions are taken even when most experts think they will be bad. In that sense, integrity of the kind you're praising is actually potentially very bad and dangerous, even if there are some readings of "rational" on which it counts as rational.
Of course, what Eliezer is doing is not taking extreme actions, but recommending that governments do so in certain circumstances, and that is much less obviously a bad thing to do, since governments will also hear from experts who are closer to the median expert.
Archive link
(More of a meta point somewhat responding to some other comments.)
It currently seems unlikely there will be a unified AI risk public communication strategy. AI risk is an issue that affects everyone, and many people are going to weigh in on it. That includes both people who are regulars on this forum and people who have never heard of it.
I imagine many people will not be moved by Yudkowsky's op-ed, and others will be. People who think AI x-risk is an important issue but who still disagree with Yudkowsky will have their own public writing that may be partially contradictory. Of course people should continue to talk to each other about their views, in public and in private, but I don't expect that to produce "message discipline" (nor should it).
The number of people concerned about AI x-risk is going to get large enough (and arguably already is) that credibility will become highly unevenly distributed among those concerned about AI risk. Some people may think that Yudkowsky lacks credibility, or that his op-ed damages it, but that needn't damage the credibility of everyone who is concerned about the risks. Back when there were only a few major news articles on the subject, that might have been more true, but it's not anymore. Now everyone from Geoffrey Hinton to Gary Marcus (somehow) to Elon Musk to Yuval Noah Harari is talking about the risks. While it's possible everyone could be lumped together as "the AI x-risk people," at this point I think that's a diminishing possibility.
I hope that this article sends the signals that pausing the development of the largest AI models is good, that informing society about AGI x-risk is good, and that we should find a coordination method (regulation) to make sure we can effectively stop training models that are too capable.
What I think we should do now is:
1) Write good hardware regulation policy proposals that could reliably pause the development towards AGI.
2) Campaign publicly to get the best proposal implemented, first in the US and then internationally.
This could be a path to victory.
I appreciate Eliezer's honesty and consistency in what he is calling for. This approach makes sense if you believe, as Eliezer does, that p(doom | business as usual) > 99%. Then it is worth massively increasing the risk of a nuclear war. If you believe, as I do and as most AI experts do, that p(doom | business as usual) < 20%, this plan is absolutely insane.
This line of thinking is becoming more and more common in EA. It is going to get us all killed if it has any traction. No, the U.S. should not be willing to bomb Chinese data centers and risk a global nuclear war. No, repeatedly bombing China for pursuing something that is a central goal of the CCP, with dangers that are completely illegible to 90% of the population, is not a small, incremental risk of nuclear war on the scale of aiding Ukraine, as some other commenters are suggesting. This is insane.
By all means, I support efforts for international treaties. Bombing Chinese data centers is suicidal and we all know it.
I say this all as someone who is genuinely frightened of AGI. It might well kill us, but not as quickly or surely as implementing this strategy will.
Edited to reflect that upon further thought, I probably do not support bombing the data centers of less powerful countries either.
We can still have this ending (cf. Terminator analogies):
Yudkowsky's suggestions seem entirely appropriate if you truly believe, like him, that AI x-risk is probability ~100%.
However, that estimate is absurdly high, resting on unproven and unlikely assumptions, such as the idea that an AI could build nanofactories by ordering proteins to be mixed over email.
In the actual world, where the probability of extinction is significantly less than 100%, are these proposals valuable? It seems like they will just get everyone else labelled luddites and fearmongers, especially if years and decades go by with no apocalypse in sight.
Many things about this comment seem wrong to me.
These proposals would plausibly be correct (to within an order of magnitude) in terms of the appropriate degree of response even with much lower probabilities of doom (e.g., 10-20%). I think you need to actually run the math to say that this doesn't make sense.
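To make "run the math" concrete, here is a minimal expected-value sketch of the trade-off being debated in this thread (AI extinction risk vs. added nuclear-war risk from enforcing a moratorium). Every number in it — the probabilities, the assumed death tolls, and the assumed effect of a moratorium — is an illustrative assumption, not a figure from the article or from anyone in this discussion.

```python
# Illustrative expected-value sketch; all inputs are assumptions for illustration only.

WORLD_POPULATION = 8e9

# Assumed death tolls for each catastrophe (placeholders, not estimates from the article).
DEATHS_AI_DOOM = WORLD_POPULATION   # the "everyone dies" scenario
DEATHS_NUCLEAR_WAR = 1e9            # assumed toll of a full nuclear exchange


def expected_deaths(p_doom: float, p_nuclear: float) -> float:
    """Expected deaths if the two catastrophes are treated as independent risks."""
    return p_doom * DEATHS_AI_DOOM + p_nuclear * DEATHS_NUCLEAR_WAR


# Scenario A: business as usual — assume 15% p(doom) and a 2% baseline nuclear risk.
baseline = expected_deaths(p_doom=0.15, p_nuclear=0.02)

# Scenario B: enforced moratorium — assume p(doom) drops to 5% while nuclear risk rises to 5%.
with_moratorium = expected_deaths(p_doom=0.05, p_nuclear=0.05)

print(f"Business as usual: {baseline:.2e} expected deaths")
print(f"With moratorium:   {with_moratorium:.2e} expected deaths")
# With these assumed inputs the moratorium comes out ahead (~0.45e9 vs ~1.22e9 expected deaths),
# but different assumptions can easily flip the conclusion — which is the point: the verdict
# depends on the numbers you plug in, not just on the structure of the argument.
```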
This is a deeply distorted understanding of Eliezer's threat model, which is not any specific story that he can tell, but the brute fact that something smarter than you (and him, and everyone else) will come up with something better than that.
I do not think it is ever particularly useful to ask "is someone else's conclusion valid given my premises, which are importantly different from theirs?" if you are attempting to argue against someone's premises. Obviously "A => B" & "C" does not imply "B", and it especially...
I really don't like the rhetorical move you're making here. You (as well as many people on this forum) think his beliefs are incorrect; others on this forum think they are correct. Insofar as there's no real consensus for which side is correct, I'd strongly prefer people (on both sides) use language like "given his, in my opinion, incorrect beliefs" as opposed to just stating as a matter of fact that he's incorrect.
All else equal, this depends on what increase in risk of nuclear war you're trading off against what decrease in x-risk from AI. We may have "increased" risk of nuclear war by providing aid to Ukraine in its war against Russia, but if it was indeed an increase it was probably small and worth the trade-off[1] against our other goals (such as disincentivizing the beginning of wars which might lead to nuclear escalation in the first place). I think approximately the only unusual part of Eliezer's argument is the fact that he doesn't beat around the bush in spelling out the implications.
Asserted for the sake of argument; I haven't actually demonstrated that this is true but my point is more that there are many situations where we behave as if it is obviously a worthwhile trade-off to marginally increase the risk of nuclear war.
I don't think the crux here is about nanofactories – I'd imagine that if Eliezer considered a world identical to ours but where nanofactories were impossible, he'd place (almost) as high probability on doom (though he'd presumably expect doom to be somewhat more drawn out).
This proposal seems to have become extremely polarizing, more so and for different reasons than I would have expected after first reading this. I am more on the "this is pretty fine" side of the spectrum, and think some of the reasons it has been controversial are sort of superficial. Given this though, I want to steelman the other side (I know Yudkowsky doesn't like steelmanning, too bad, I do), with a few things that are plausibly bad about it that I don't think are superficial or misreadings, as well as some start of my reasons for worrying less about t...
Yudkowsky claims that AI developers are plunging headlong into our research in spite of believing we are about to kill all of humanity. He says each of us continues this work because we believe the herd will just outrun us if any one of us were to stop.
The truth is nothing like this. The truth is that we do not subscribe to Yudkowsky’s doomsday predictions. We work on artificial intelligence because we believe it will have great benefits for humanity and we want to do good for humankind.
We are not the monsters that Yudkowsky makes us out to be.
I believe you that you're honestly speaking for your own views, and for the views of lots of other people in ML. From experience, I know that there are also lots of people in ML who do think AGI is likely to kill us all, and choose to work on advancing capabilities anyway. (With the justification Eliezer highlighted, and in many cases with other justifications, though I don't think these are adequate.)
I'd be interested to hear your views about this, and why you don't think superintelligence risk is a reason to pause scaling today. I can imagine a variety of reasons someone might think this, but I have no idea what your reason is, and I think conversation about this is often quite productive.
It's hard to have strong confidence in these numbers, but when surveys of AI developers who publish at prestigious conferences ask about the probability of AGI "causing human extinction or similarly permanent and severe disempowerment of the human species?", they often get numbers in the single-digit percentage points.
This is a meaningfully different claim than "likely to kill us all" which is implicitly >50%, but not that different in moral terms. The optimal level of extinction risk that humanity should be willing to incur is not 0, but it should be quite low.
I don't think I've met people working on AGI who have P(doom) > 50%. I think I fairly often talk to people at e.g. OpenAI or DeepMind who believe it's 0.1%-10%, however. And again, I don't find the difference between probabilistically killing people at 5% vs. 50% that morally significant.
I don't know how useful it is to conceptualize AI engineers who actively believe >50% P(doom) as evil or "low-lifes", while giving a pass to people who have lower probabilities of doom. My guess is that it isn't, and it would be better if we had an honest perspective overall. Relatedly, it's better for people to be able to honestly admit "many people will see my work as evil but I'm doing it for xyz reasons anyway" rather than delude themselves otherwise and come up with increasingly implausible analogies, or refuse to engage at all.
I agree this is a confusing situation. My guess is most people compartmentalize and/or don't think of what they...
I’m going to go against the grain here, and explain how I truly feel about this sort of AI safety messaging.
As others have pointed out, fearmongering on this scale is absolutely insane to those who don't have a high probability of doom. Worse, Eliezer is calling for literal nuclear strikes and great power war to stop a threat that isn't even provably real! Most AI researchers do not share his views, and neither do I.
I want to publicly state that pushing this maximized narrative about AI x-risk will lead to terrorist actions against GPU clusters or individuals ...
He's calling for a policy that would be backed by whatever level of response was necessary to enforce it, including, if it escalated to that level, military response (plausibly including nuclear). This is different from, right now, literally calling for nuclear strikes. The distinction may be somewhat subtle, but I think it's important to keep this distinction in mind during this discussion.
This statement strikes me as overconfident. While the narrative presumably does at least somewhat increase the personal security concerns of individuals involved in AI, I think we need to be able to have serious discussions on the topic, and public policy shouldn't be held hostage to worries that discussions about problems will somewhat increase the security concerns of those involved in those problems (e.g., certain leftist discourse presumably somewhat increases the personal security concerns of rich people, but I don't think that fact is a good argument against leftism or in favor of silencing leftists).
The full text of the TIME piece is now available on the EA Forum here, with two clarifying notes by Eliezer added at the end.
What if actors with bad intentions don't stop (and we won't be able to know about that), and they create a more powerful AI than what we have now?
Extremely cringe article.
The argument that AI will inevitably kill us has never been well-formed and he doesn't propose a good argument for it here. No-one has proposed a reasonable scenario by which immediate, unpreventable AI doom will happen (the protein nanofactories-by-mail idea underestimates the difficulty of simulating quantum effects on protein behaviour).
A human dropped into a den of lions won't immediately become their leader just because the human is more intelligent.