I don't think this is an accurate summary of Dario's stated views. Here's what he said in 2023 on the Dwarkesh podcast:
Dwarkesh Patel (00:27:49 - 00:27:56):
When you add all this together, what does your estimate of when we get something kind of human level look like?
Dario Amodei (00:27:56 - 00:29:32):
It depends on the thresholds. In terms of someone looks at the model and even if you talk to it for an hour or so, it's basically like a generally well educated human, that could be not very far away at all. I think that could happen in two or three years.
Here's what he said in a statement in February:
Possibly by 2026 or 2027 (and almost certainly no later than 2030), the capabilities of AI systems will be best thought of as akin to an entirely new state populated by highly intelligent people appearing on the global stage—a “country of geniuses in a datacenter”—with the profound economic, societal, and security implications that would bring.
These are different ideas, so I think it would be reasonable to have different timelines for "if you talk to it for an hour or so, it's basically like a generally well educated human" and "a country of geniuses in a datacenter." Nevertheless, the two timelines overlap substantially, even though the statements were made about 18 months apart; he also uses language that signals some uncertainty at both points. I don't think this is particularly suspicious; it seems pretty consistent to me.
Thanks for sharing this fun paper!
I think I disagree with several key parts of the argument.
4. Professional philosophers are among the most educated and skeptical people on the planet. Yet, according to the 2020 PhilPapers survey, 18.83% of them accept or lean toward theism (too low due to selection effects?). 7.21% were agnostic. If we play it safe and suppose that only a third of the theist philosophers believe in hell, that’s about 6%. Thus, (on a very conservative estimate) about 6% of the most skeptical people on the planet believe in hell.
I think this makes a pretty important error in reasoning. Grant that philosophers in general are among the most skeptical people on the planet. Then you select a 6% segment of them. The generalization that these are still among the most skeptical people on the planet is erroneous. This 6% could have (e.g.) average levels of skepticism, and it's the rest of the group that brings up the average level of skepticism of the group.
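To make the arithmetic behind this concrete, here is a toy sketch. Every number in it is made up for illustration (the 0-100 "skepticism score", the population mean of 50, and the two subgroup means are all assumptions, not survey data); the only figure taken from the quoted argument is the roughly 6% share.

```python
# Toy illustration of the subgroup point. All scores are hypothetical.
GENERAL_POPULATION_MEAN = 50.0  # assumed average skepticism score (0-100 scale)

subgroup_share = 0.06  # the roughly 6% segment from the quoted argument
subgroup_mean = 50.0   # assumption: this segment is only averagely skeptical
rest_mean = 90.0       # assumption: the other 94% are highly skeptical

# The group average is a share-weighted mean of the two segments.
overall_mean = subgroup_share * subgroup_mean + (1 - subgroup_share) * rest_mean

print(f"overall mean: {overall_mean:.1f}")  # overall mean: 87.6
```

Under these assumed numbers the group as a whole still scores far above the general population, even though the 6% segment is exactly average. So a high group average by itself tells us little about any particular small segment.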
Here’s Jesus:
“When the Son of Man comes into his glory, and all the angels with him, then he will sit on his glorious throne. Before him will be gathered all the nations and he will separate people one from another as a shepherd separates the sheep from the goats. And he will place the sheep on his right, but the goats on the left. Then the King will say to those on his right, “Come you who are blessed by my Father, inherit the kingdom prepared for you from the foundation of the world. For I was hungry and you gave me food…
Then he will say to those on his left [the goats], Depart from me you cursed, into the eternal fire prepared for the devil and his angels. For I was hungry and you gave me no food…Truly, I say to you, as you did not do it to one of the least of these you did not do it to me. And these will go away into eternal punishment, but the righteous into eternal life.” (Matt 25:31-46)
This is among the passages commonly interpreted as Jesus discussing hell. However, note that it doesn't actually show Jesus discussing hell as we've been taught to think of it. First, he's clearly speaking in metaphor: he's not talking about literal sheep and goats. It's not clear what the "eternal punishment" he's referring to is. Some people interpret this as more of a "final" punishment, e.g. death, rather than eternal suffering. And indeed, if Jesus were referring to hell as traditionally conceived, I'd expect him to be clearer about this.
Many scholars on the topic have written extensively about this. My understanding is that there's little solid basis for getting the traditionally understood concept of hell out of the core ancient sources. And I'd expect, if it were true, and Jesus really were communicating about something as important as hell with divine knowledge, there would be no ambiguity about it. (Since the Quran comes after and is influenced by Christian sources, I don't think we should read it as a separate source of evidence.)
I think this is a very strong reason to doubt the plausibility of hell. And there are many other such reasons:
The weight of these considerations drives the plausibility of hell extremely low, much lower in my view than the possibility of x-risk from risks like nuclear weapons, pandemics, AI, or even natural sources like asteroids (which, unlike hell, we know exist and have previously impacted the lives of species).
I think this does make the odds of a religious catastrophe pascalian, and worth rejecting on that basis.
Even if the risk weren't pascalian, I think there's another problem with this argument, with reference to this part of the argument:
Each religion has infinite stakes, so the expected (dis)value of each is equal.
- Suppose I offer you one of two lottery tickets with the same payoff:
Ticket 1: Provides a 1/10,000 probability of infinite bliss, or
Ticket 2: Provides a 1/3 probability of infinite bliss.
- The expected value of selecting each ticket is infinite (therefore, equal). Are you indifferent? No.
- Lesson: When payoffs are equal, choose the most probable option.
- EAs already do this with catastrophic risks. They prioritize based on probabilities.
- Practical Upshot: Devote resources to religions in proportion to probabilities. Most resources to most probably religion, second-most resources to second-most probable religion, etc.
The problem here is that if you advocate for the wrong religion, you might increase the chance people go to hell, because some religions think believing in another religion would make you go to hell. So actions on this basis have to grapple with the possibilities of infinite bliss and infinite suffering, and we often might have just as much reason to think we're increasing one or decreasing the other. And since there's no reliable method for coming to consensus on these kinds of religious questions, we should think a problem like "reduce the probability people will go to hell", even if the risk level weren't pascalian, is entirely intractable.
What a belief implies about what someone does depends on many other things, like other beliefs and their options in the world. If, e.g., there are more opportunities to work on x-risk reduction than s-risk reduction, then it might be true that optimistic longtermists are less likely than pessimistic longtermists to form families (because they're more focused on work).
Having clarified that, do you really not find optimistic longtermism more evolutionarily adaptive than pessimistic longtermism?
As my answer made clear, the point I really want to emphasise is that this feels like an absurd exercise — there's no reason to believe that longtermist beliefs are heritable or selected for in our ancestral environment.
Yes, I do think this: "Not optimistic longtermism is at least just as evolutionarily debunkable as optimistic longtermism."
That's what I think our prior should be, and generally we shouldn't accept evolutionary debunking arguments for moral beliefs unless there are actual findings in evolutionary psychology that suggest evolutionary pressure is the best explanation for them. I think it's indeed trivially easy to come up with some story for why any given belief is subject to evolutionary debunking, but these stories are so easy to come up with that they provide essentially no meaningful evidence that the debunking is warranted, unless further substantiated.
E.g., I think the claim that pessimistic longtermism is evolutionarily selected for, because it would cause people to care more about their own families and kin than about far-off generations, is at least as plausible as your claim about optimistic longtermism. Or we might think agnostic longtermism is selected for, because we're cognitive misers and thinking about the long-term future is too intensive and not decision relevant to be selected for. In fact, I think none of these claims is very plausible at all, because I don't think it's likely evolution is selecting for these kinds of beliefs at this level of detail.
My argument about neutrality toward creating lives also counts against your claim, because if it were true that there was evolutionary pressure toward pro-natalist, optimistic longtermism, I would predict we'd not see intuitions for neutrality about creating future lives be so prevalent. But they are prevalent, so this is another reason I don't think your claim is plausible.
I mean... it's quite easy. There were people who, for some reason, were optimistic regarding the long-term future of humanity and they had more children than others (and maybe a stronger survival drive), all else equal. The claim that there exists such a selection effect seems trivially true.
I agree that you can construct hypothetical scenarios in which a given trait is selected for (though even then you have to postulate that it's heritable, which you didn't specify here). But your claim is not trivially true, and it does not establish that optimism regarding the long-term future of humanity has in fact been selected for in human evolutionary history. Other beliefs that are more plausibly susceptible to evolutionary debunking include the idea that we have special obligations to our family members, since these are likely connected to kinship ties that have been widely studied across many species.
So I think a key crux between us is on the question: what does it take for a belief to be vulnerable to evolutionary debunking? My view is that it should actually be established in the field of evolutionary psychology that the belief is best explained as the direct[1] product of our evolutionary history. (Even then, as I think you agree, that doesn't falsify the belief, but it gives us reason to be suspicious of it.)
I asked ChatGPT how evolutionary psychologists typically try to show that a psychological trait was selected for. Here was its answer:
Evolutionary psychologists aim to show that a psychological trait is a product of selection by demonstrating that it likely solved adaptive problems in our ancestral environment. They look for traits that are universal across cultures, appear reliably during development, and show efficiency and specificity in addressing evolutionary challenges. Evidence from comparative studies with other species, heritability data, and cost-benefit analyses related to reproductive success also support such claims. Altogether, these approaches help build a case that the trait was shaped by natural or sexual selection rather than by learning or cultural influence alone.
I think you might say that you don't have to show that a belief is best explained by evolutionary pressure, just that there's some selection for it. In fact, I don't think you've done that (because e.g. you have to show that it's heritable). But I think that's not nearly enough, because "some evolutionary pressure toward belief X" is a claim we can likely make about any belief at all. (E.g., pessimism about the future can be very valuable, because it can make you aware of potential dangers that optimists would miss.)
Also, in response to this:
On person-affecting beliefs: The vast majority of people holding these are not longtermists to begin with. What we should be wondering is "to the extent that we have intuitions about what is best for the long-term (and care about this), where do these intuitions come from?". Non-longtermist beliefs are irrelevant, here. Hopefully, this also addresses your last bullet point.
I'm not sure why you think non-longtermist beliefs are irrelevant. Your claim is that optimistic longtermist beliefs are vulnerable to evolutionary debunking. But that would only be true if they were plausibly a product of evolutionary pressures which should apply to populations that have been subject to evolutionary selection; otherwise they're not a product of our evolutionary history. And so evidence of what humans generally are prone to believe seems highly relevant. The fact that many people, perhaps most, are pre-theoretically disposed toward views that push away from optimistic longtermism and pro-natalism casts further doubt on the claim that the intuitions that push people toward optimistic longtermism and pro-natalism have been selected for.
I used "direct" here because, in some sense, all of our beliefs are the product of our evolutionary history.
I don't think it's plausible that optimistic longtermism is vulnerable to evolutionary debunking, because:
I think if you were to turn this into an academic paper, I'd be interested to see if you could defend the claim that pro-natalist beliefs have been selected for in human evolutionary history.
Hi Rebecca,
Thanks for the question!
We did consider this as an option, and it's possible there are some versions of this we could do in the future, but it's not part of next steps at the moment. The basic reason is that this new strategic approach is the continuation of the direction 80k has been going for many years, so there’s not a segment of 80k with a separate focus to “spin-off” into a new entity.
Thanks for the additional context! I think I understand your views better now and I appreciate your feedback.
Just speaking for myself here, I think I can identify some key cruxes between us. I'll take them one by one:
I think the impact of most actions here is basically chaotic.
I disagree with this. I think it's better if people have a better understanding of the key issues raised by the emergence of AGI. We don't have all the answers, but we've thought about these issues a lot and have ideas about what kinds of problems are most pressing to address and what some potential solutions are. Communicating these ideas more broadly and to people who may be able to help is just better in expectation than failing to do so (all else equal), even though, as with any problem, you can't be sure you're making things better, and there's some chance you make things worse.
I also think "make the world better in meaningful ways in our usual cause areas before AGI is here" probably helps in many worlds, due to things like AI maybe trying to copy our values, or AI could be controlled by the UN or whatever and it's good to get as much moral progress in there as possible beforehand, or just updates on the amount of morally aligned training data being used.
I don't think I agree with this. I think the value of doing work in areas like global health or helping animals is largely in the direct impact of these actions, rather than in any effect they have on how the arrival of AGI goes. I don't think even if, in an overwhelming success, we cut malaria deaths in half next year, that will meaningfully increase the likelihood that AGI is aligned or that the training data reflects a better morality. It's more likely that directly trying to work to create beneficial AI will have these effects. Of course, the case for saving lives from malaria is still strong, because people's lives matter and are worth saving.
I think that more serious consideration of the Existential Risk Persuasion Tournament leads one to conclude that wildly transformational outcomes just aren't that likely in the short/medium term.
Recall that the XPT is from 2022, so there's a lot that's happened since. Even still, here's what Ezra Karger noted about the expectations of the experts and forecasters when we interviewed him on the 80k podcast:
One of the pieces of this work that I found most interesting is that even though domain experts and superforecasters disagreed strongly, I would argue, about AI-caused risks, they both believed that AI progress would continue very quickly.
So we did ask superforecasters and domain experts when we would have an advanced AI system, according to a definition that relied on a long list of capabilities. And the domain experts gave a year of 2046, and the superforecasters gave a year of 2060.
My understanding is that XPT was using the definition of AGI used in the Metaculus question cited in Niel's original post (though see his comment for some caveats about the definition). In March 2022, that forecast was around 2056-2058; it's now at 2030. The Metaculus question also has over 1500 forecasters, whereas XPT had around 30 superforecasters, I believe. So overall I wouldn't consider XPT to be strong evidence against short timelines.
I think there is some general "outside view" reason to be sceptical of short timelines. But I think there are good reasons to think that kind of perspective would miss big changes like this, and there is enough reason to believe short timelines are plausible to take action on that basis.
Again, thanks for engaging with all this!
One reason we use phrases like “making AGI go well,” rather than some alternatives, is that 80k is concerned about risks like lock-in of really harmful values, in addition to human disempowerment and extinction risk, so I sympathise with your worries here.
Figuring out how to avoid these kinds of risks is really important, and recognising that they might arise soon is definitely within the scope of our new strategy. We have written about ways the future can look very bad even if humans have control of AI, for example here, here, and here.
I think it’s plausible to worry that not enough is being done about these kinds of concerns — that depends a lot on how plausible they are and how tractable the solutions are, which I don’t have very settled views on.
You might also think that there’s nothing tractable to do about these risks, so it’s better to focus on interventions that pay off in the short-term. But my view at least is that it is worth putting more effort into figuring out what the solutions here might be.
At 80,000 Hours, we published an article on this topic in 2023 by Benjamin Todd. It's a follow-up to Toby Ord's original work, and looks at other datasets and cause areas.
Benjamin concluded:
And also:
There's a ton more detail in the article.