Keep this post on ice and uncork it when the bubble pops. It may mean nothing to you now; I hope it means something when the time comes.
This post is written with an anguished heart, from love, which is the only good reason to do anything.
I hope that the AI bubble popping is like the FTX collapse 2.0 for effective altruism. Not because it will make funding dry up — it won't. And not because it will have any relation to moral scandal — it won't. But it will be a financial wreck — orders of magnitude larger than the FTX collapse — that could lead to soul-searching for many people in effective altruism, if they choose to respond that way. (It may also have indirect reputational damage for EA by diminishing the credibility of the imminent AGI narrative — too early to tell.)
In the wake of the FTX collapse, one of the positive signs was the eagerness of people to do soul-searching. It was difficult, and it's still difficult, to know how to make sense of EA's role in FTX. Did powerful people in the EA movement somehow contribute to the scam? Or did they just get scammed too? Were people in EA accomplices or victims? What is the lesson? Is there one? I'll leave that to be sorted out another time. The point here is that people were eager to look for the lesson, if there was one to find, and to integrate it. That's good.
It's highly probable that there is an AI bubble.[1] Nobody can predict when a bubble will pop, even if they can correctly call that there is a bubble. So, we can only say that there is most likely a bubble and it will pop... eventually. Maybe in six months, maybe in a year, maybe in two years, maybe in three years... Who knows. I hope that people will experience the reverberations of that bubble popping — possibly even triggering a recession in the U.S., although it may be a bit like the straw that broke the camel's back in that case — and bring the same energy they brought to the FTX collapse. The EA movement has been incredibly bought-in on AI capabilities optimism and that same optimism is fueling AI investment. The AI bubble popping would be a strong signal that this optimism has been misplaced.
Unfortunately, it’s always possible to not learn lessons. The futurist Ray Kurzweil has made many incorrect predictions about the future. His strategy in many such cases is to find a way he can declare he was correct or "essentially correct" (see, e.g., page 132 here). Tesla CEO Elon Musk has been predicting every year for the past seven years or so that Teslas will achieve full autonomy — or something close to it — in a year, or next year, or by the end of the year. Every year it doesn't happen, he just pushes the prediction back another year; he's done that about seven times now. Since around 2018, Tesla's achievement of full autonomy (or something close) has been perpetually about a year away.
When the AI bubble pops, I fear both of these reactions. The Kurzweil-style reaction is to interpret the evidence in a way — any way — that allows one to be correct. There are a million ways of doing this. One way would be to tell a story where AI capabilities were indeed on the trajectory originally believed, but AI safety measures — thanks in part to the influence of AI safety advocates — led to capabilities being slowed down, held back, sabotaged, or left on the table in some way. This is not far off from the sorts of things people have already argued. In 2024, the AI researcher and investor Leopold Aschenbrenner published an extremely dubious essay, "Situational Awareness", which, in between made-up graphs, argues that AI models are artificially or unfairly "hobbled" in a way that makes their base, raw capabilities seem significantly less than they really are. Implement commonsense, straightforward unhobbling techniques, the argument goes, and models will become much more capable and reveal their true power. From here, it would only be one more step to say that AI companies deliberately left their models "hobbled" for safety reasons. But this is just one example. There are an unlimited number of ways you could try to tell a story like this.
Arguably, Anthropic CEO Dario Amodei engaged in Kurzweil-style obfuscation of a prediction this year. In mid-March, Amodei predicted that by mid-September 90% of code would be written by AI. When nothing close to this happened, Amodei said, "Some people think that prediction is wrong, but within Anthropic and within a number of companies that we work with, that is absolutely true now." When pressed, he clarified that this was only true "on many teams, not uniformly, everywhere". That's a bailey within a bailey.
The Musk-style reaction is just to kick the can down the road. People in EA or EA-adjacent communities have already been kicking the can down the road. AI 2027, which was actually AI 2028, is now AI 2029. And that's hardly the only example.[2] Metaculus was at 2030 on AGI early in the year and now it's at 2033.[3] The can is kicked.
There's nothing inherently wrong with kicking the can down the road. There is something wrong with the way Musk has been doing it. At what point does it cross over from making a reasonable, moderate adjustment to making the same mistake over and over? I don't think there's an easy way to answer this question. I think the best you can do is see repeated can kicks as an invitation to go back to basics, to the fundamentals, to adopt a beginner's mind, and to try to rethink things from the beginning. As you retrace your steps, you might end up in the same place all over again. But you might notice something you didn't notice before.
There are many silent alarms already ringing about the imminent AGI narrative. One of effective altruism's co-founders, the philosopher and AI governance researcher Toby Ord, wrote brilliantly about one of them. Key quote:
Grok 4 was trained on 200,000 GPUs located in xAI’s vast Colossus datacenter. To achieve the equivalent of a GPT-level jump through RL [reinforcement learning] would (according to the rough scaling relationships above) require 1,000,000x the total training compute. To put that in perspective, it would require replacing every GPU in their datacenter with 5 entirely new datacenters of the same size, then using 5 years worth of the entire world’s electricity production to train the model. So it looks infeasible for further scaling of RL-training compute to give even a single GPT-level boost.
The respected AI researcher Ilya Sutskever, who played a role in kicking off the deep learning revolution in 2012 and who served as OpenAI's Chief Scientist until 2024, has declared that the age of scaling in AI is over, and we have now entered an age of fundamental research. Sutskever highlights “inadequate” generalization as a flaw with deep neural networks and has previously called out reliability as an issue. A survey from earlier this year found that 76% of AI experts think it's "unlikely" or "very unlikely" that scaling will lead to AGI.[4]
And of course the signs of the bubble are also signs of trouble for the imminent AGI narrative. Generative AI isn't generating profit. For enterprise customers, it can't do much that's practically useful or financially valuable. Optimistic perceptions of AI capabilities are based on contrived, abstract benchmarks with poor construct validity, not hard evidence about real world applications.[5] Call it the mismeasurement of the decade!
My fear is that EA is going to barrel right into the AI bubble, ignoring these obvious warning signs. I'm surprised how little attention Toby Ord's post has gotten. Ord is respected by all of us in this community and therefore he has a big megaphone. Why aren't people listening? Why aren't they noticing this? What is happening?
It's like EA is a car blazing down the street at racing speeds, blowing through stop signs, running red lights... heading, I don't know where, but probably nowhere good. I don't know what can stop the momentum now, except maybe something on a scale that shakes the macroeconomy of the United States.
The best outcome would be for the EA community to deeply reflect and to reevaluate the imminent AGI narrative before the bubble pops; the second-best outcome would be to do this soul-searching afterward. So, I hope people will do that soul-searching, like the post-FTX soul-searching, but even deeper. 99%+ of people in EA had no direct personal connection to FTX. Evidence about what EA leaders knew and when they knew it was (and largely still is) scant, making it hard to draw conclusions, as much as people desperately (and nobly) wanted to find the lesson. Not so for AGI. For AGI, most people have some level of involvement, even if small, in shaping the community's views. Everyone's epistemic practices — not "epistemics", which is a made-up word that isn't used in philosophy — are up for questioning here, even for people who just vaguely think, "I don't really know anything about that, but I'll just trust that the community is probably right."
The science communicator Hank Green has an excellent video from October where he explains some of the epistemology of science and why we should follow Carl Sagan's famous maxim that "extraordinary claims require extraordinary evidence". Hank Green is talking about evidence of intelligent alien life, but what he says applies equally well to intelligent artificial life. When we're encountering something unknown and unprecedented, our observations and measurements should be under a higher level of scrutiny than we accept for ordinary, everyday things. Perversely, the standard of evidence in AGI discourse is the opposite. Arguments and evidence that wouldn't even pass muster as part of an investment thesis are used to forecast the imminent, ultimate end of humanity and the invention of a digital God. What's the base rate of millennialist views being correct? 0.00%?
Watch the video and replace "aliens" with "AGI":
I feel crazy and I must not be the only one. None of this makes any sense. How did a movement that was originally about rigorous empirical evaluation of charity cost-effectiveness become a community where people accept eschatological arguments based on fake graphs and gut intuition? What?? What are you talking about?! Somebody stop this car!
And lest you misunderstand me, when I started my Medium blog back in 2015, my first post was about the world-historical, natural historical importance of the seemingly inevitable advent of AGI and superintelligence. On an older blog that no longer exists, posts on this theme go back even further. What a weird irony to find myself in now. The point is not whether AGI is possible in principle or whether it will eventually be created if science and technology continue making progress — it seems hard to argue otherwise — but that this is not the moment. It's not even close to the moment.
The EA community has a whiff of macho dunk culture at times (so does Twitter, so does life), so I want to be clear that's absolutely not my intention. I'm coming at this from a place of genuine maternal love and concern. What's going on, my babies? How did we get here? What happened to that GiveWell rigour?
Of course, nobody will listen to me now. Maybe when the bubble pops. Maybe. (Probably not.)
This post is not written to convince anyone today. It's written for the future. It's a time capsule for when the bubble pops. When that moment comes, it's an invitation for sober second thought. It's not an answer, but an unanswered question. What happened, you guys?
[1] See "Is the AI Industry in a Bubble?" (November 15, 2025).
[2] In 2023, 2024, and 2025, Turing Award-winning AI researcher Geoffrey Hinton repeated his low-confidence prediction of AGI in 5-20 years, but it might be taking him too literally to say he pushed back his prediction by 2 years.
[3] The median date of AGI has been slipping by 3 years per year. If you update all the way, by 2033, it will have slipped to 2057.[6]
[4] Another AI researcher, Andrej Karpathy, formerly at OpenAI and Stanford but best-known for playing a leading role in developing Tesla's Full Self-Driving software from 2017 to 2022, made a splash by saying that he thought effective "agentic" applications of AI (e.g. computer-using AI systems à la ChatGPT's Agent Mode) were about a decade away — because this implies Karpathy thinks AGI is at least a decade away. I personally didn't find this too surprising or particularly epistemically significant; Karpathy is far from the first, only, or most prominent AI researcher to say something like this. But I think this broke through a lot of people's filter bubbles because Karpathy is someone they listen to, and it surprised them because they aren't used to hearing even a modestly more conservative view than AGI by 2030, plus or minus two years.
[5] Edited on Monday, December 8, 2025 at 12:05pm Eastern to add: I just realized I’ve been lumping in criterion validity with construct validity, but they are two different concepts. Both are important in this context. Both concepts fall under the umbrella of measurement validity.
[6] If you think this meta-induction is ridiculous, you’re right.

I directionally agree that EAs are overestimating the imminence of AGI and will incur some credibility costs, but the bits of circumstantial evidence you present here don't warrant the confidence you express. 76% of experts saying it's "unlikely" the current paradigm will lead to AGI leaves ample room for a majority thinking there's a 10%+ chance it will, which is more than enough to justify EA efforts here.
And most of what EAs are working on is determining whether we're in that world and what practical steps you can take to safeguard value given what we know. It's premature to declare case closed when the markets and the field are still mostly against you (at the 10% threshold).
I wish EA were a bigger and broader movement such that we could do more hedging, but given that you only have a few hundred people and a few $100m/yr, it's reasonable to stake that on something this potentially important that no one else is doing effective work on.
I would like to bring back more of the pre-ChatGPT disposition where people were more comfortable emphasizing their uncertainty, but standing by the expected value of AI safety work. I'm also open to the idea that that modesty too heavily burdens our ability to have impact in the 10%+ of worlds where it really matters.
I agree that the OP is too confident/strongly worded, but IMO this could be dangerously wrong. As long as AI safety consumes resources that might have counterfactually gone to e.g. nuclear disarmament or stronger international relations, it might well be harmful in expectation.
This is doubly true for warlike AI 'safety' strategies like Aschenbrenner's call to intentionally enter an arms race with China, Hendrycks, Schmidt and Wang's call to 'sabotage' countries that cross some ill-defined threshold, and Yudkowsky calling for airstrikes on data centres. I think such 'AI safety' efforts are very likely increasing existential risk.
A 10% chance of transformative AI this decade justifies current EA efforts to make AI go well. That includes the opportunity costs of that money not going to other things in the 90% of worlds. Spending money on e.g. nuclear disarmament instead of AI also implies harm in the 10% of worlds where TAI was coming. Just calculating the expected value of each accounts for both of these costs.
It's also important to understand that Hendrycks and Yudkowsky were simply describing/predicting the geopolitical equilibrium that follows from their strategies, not independently advocating for the airstrikes or sabotage. Leopold is a more ambiguous case, but even he says that the race is already the reality, not something he prefers independently. I also think very few "EA" dollars are going to any of these groups/individuals.
I don't think it's clear, absent further argument, that there has to be a 10% chance of full AGI in the relatively near future to justify the currently high valuations of tech stocks. New, more powerful models could be super-valuable without being able to do all human labour. (For example, if they weren't so useful working alone, but they made human workers in most white collar occupations much more productive.) And you haven't actually provided evidence that most experts think there's a 10% chance the current paradigm will lead to AGI. Though the latter point is a bit of a nitpick if 24% of experts think it will, since I agree that is likely enough to justify EA money/concern. (Maybe the survey had some don't knows though?)
Thank you for pointing this out, David. The situation here is asymmetric. Consider the analogy of chess. If computers can’t play chess competently, that is strong evidence against imminent AGI. If computers can play chess competently — as IBM’s Deep Blue could in 1996 — that is not strong evidence for imminent AGI. It’s been about 30 years since Deep Blue and we still don’t have anything close to AGI.
AI investment is similar. The market isn’t pricing in AGI. I’ve looked at every analyst report I can find, and whatever other information I can get my hands on about how AI is being valued. The optimists expect AI to be a fairly normal, prosaic extension of computers and the Internet: office workers manipulating spreadsheets more efficiently, consumers shopping online more easily, social media platforms with chatbots that are somehow popular and profitable, LLMs playing some role in education, and chatbots doing customer support — which, along with coding, seems like one of the two areas where generative AI has some practical usefulness and financial value, although this is a fairly incremental step up from the pre-LLM chatbots and decision trees that were already widely used in customer support.
I haven’t seen AGI mentioned as a serious consideration in any of the stuff I’ve seen from the financial world.
I agree there's logical space for something less than AGI making the investments rational, but I think the gap between that and full AGI is pretty small. Peculiarity of my own world model though, so not something to bank on.
My interpretation of the survey responses is that selecting "unlikely" when there are also "not sure" and "very unlikely" options suggests substantial probability (i.e. > 10%) on the part of the respondents who say "unlikely" or "don't know." Reasonable uncertainty is all you need to justify work on something so important if true, and the cited survey seems to provide that.
People vary a lot in how they interpret terms like "unlikely" or "very unlikely" in % terms, so I think >10% is not all that obvious. But I agree that it is evidence they don't think the whole idea is totally stupid, and that a relatively low probability of near-term AGI is still extremely worth worrying about.
I should link the survey directly here: https://aaai.org/wp-content/uploads/2025/03/AAAI-2025-PresPanel-Report-FINAL.pdf
The relevant question is described on page 66.
I frequently shorthand this to a belief that LLMs won’t scale to AGI, but the question is actually broader and encompasses all current AI approaches.
Also relevant for this discussion: pages 64 and 65 of the report describe some of the fundamental research challenges that currently exist in AI capabilities. I can’t emphasize the importance of this enough. It is easy to think a problem like AGI is closer to being solved than it really is when you haven’t explored the subproblems involved or the long history of AI researchers trying and failing to solve those subproblems.
In my observation, people in EA greatly overestimate progress on AI capabilities. For example, many people seem to believe that autonomous driving is a solved problem, when this isn’t close to being true. Natural language processing has made leaps and bounds over the last seven years, but the progress in computer vision has been quite anemic by comparison. Many fundamental research problems have seen basically no progress, or very little.
I also think many people in EA overestimate the abilities of LLMs, anthropomorphizing the LLM and interpreting its outputs as evidence of deeper cognition, while also making excuses and hand-waving away the mistakes and failures — which, when it’s possible to do so, are often manually fixed using a lot of human labour by annotators.
I think people in EA need to update on:
I agree that the "unlikely" statistic leaves ample room for the majority of the field thinking there is a 10%+ chance, but it does not establish that the majority actually thinks that.
I think there are at least two (potentially overlapping) ways one could take the general concern that @Yarrow Bouchard 🔸 is identifying here. One, if accepted, leads to the substantive conclusion that EA individuals, orgs, and funders shouldn't be nearly as focused on AI because the perceived dangers are just too remote. An alternative framing doesn't necessarily lead there. It goes something like this: there has been a significant and worrisome decline in the quality of epistemic practices surrounding AI in EA since the advent of ChatGPT. If the alternative framing is accepted, but not the first, it leads in my view to a different set of recommended actions.
I flag that since I think the relevant considerations for assessing the alternative framing could be significantly different.
One need not choose between the two, because they both point toward the same path: re-examine claims with greater scrutiny. There is no excuse for the egregious flaws in works like "Situational Awareness" and AI 2027. This is not serious scholarship. To the extent the EA community gets fooled by stuff like this, its reasoning process and its weighing of evidence will be severely impaired.
If you get rid of all the low-quality work and retrace all the steps of the argument from the beginning, might the EA community end up in basically the same place all over again, with a similar estimation of AGI risk and a similar allocation of resources toward it? Well, sure, it might. But it might not.
If your views are largely informed by falsehoods and ridiculous claims, half-truths and oversimplifications, greedy reductionism and measurements with little to no construct validity or criterion validity, and, in some cases, a lack of awareness of countervailing ideas or the all-too-eager dismissal of inconvenient evidence, then you simply don’t know what your views would end up being if you started all over again with more rigour and higher standards. The only appropriate response is to clear house. Put the ideas and evidence into a crucible and burn away what doesn’t belong. Then, start from the beginning and see what sort of conclusions can actually be justified with what remains.
A large part of the blame lies at the feet of LessWrong and at the feet of all the people in EA who decided, in some important cases quite early on, to mingle the two communities. LessWrong promotes skepticism and suspicion of academia, mainstream/institutional science, traditional forms of critical thinking and scientific skepticism, journalism, and society at large. At the same time, LessWrong promotes reverence and obsequiousness toward its own community, positioning itself as an alternative authority to replace academia, science, traditional critical thought, journalism, and mainstream culture. Not innocently. LessWrong is obsessed with fringe thinking. The community has created multiple groups that Ozy Brennan describes as "cults". Given how small the LessWrong community is, I’d have to guess that the rate at which the community creates cults must be multiple orders of magnitude higher than the base rate for the general population.
LessWrong is also credulous about racist pseudoscience, and, in the words of a former Head of Communications at the Centre for Effective Altruism, is largely "straight-up racist". One of the admins of LessWrong and co-founders of Lightcone Infrastructure once said, in the context of a discussion about the societal myth that gay people are evil or malicious and a danger to children:
Such statements make "rationalist" a misnomer. (I was able to partially dissuade him of this nonsense by showing him some of the easily accessible evidence he could have looked up for himself, but the community did not seem to particularly value my intervention.)
I don’t know that the epistemic practices of the EA community can be rescued as long as the EA community remains interpenetrated with LessWrong to a major degree. The purpose of LessWrong is not to teach rationality, but to disable one’s critical faculties until one is willing to accept nonsense. Perhaps it is futile to clamour for better-quality scholarship when such a large undercurrent of the EA community is committed to the idea that normal ideas of what constitutes good scholarship are wrong and that the answers to what constitutes actually good scholarship lie with Eliezer Yudkowsky, an amateur philosopher with no relevant qualifications or achievements in any field, who frequently speaks with absolute confidence and is wrong, whom experts often find non-credible, who has said he literally sees himself as the smartest person on Earth, and who rarely admits mistakes (despite making many) or issues corrections. If Yudkowsky is your highest and most revered authority, if you follow him in rejecting academia, institutional science, mainstream philosophy, journalism, normal culture, and so on, then I don’t know what could possibly convince you that the untrue things you believe are untrue, since your fundamental epistemology comes down to whether Yudkowsky says something is true or not, and he’s told you to reject all other sources of truth.
To the extent the EA community is under LessWrong’s spell, it will probably remain systemically irrational forever. Only within the portions of the EA community who have broken that spell, or never come under it in the first place, is there the hope for academic standards, mainstream scientific standards, traditional critical thinking, journalistic fact-checking, culturally evolved wisdom, and so on to take hold. It would be like expecting EA to be rational about politics while 30% of the community is under the spell of QAnon, or to be rational about global health while a large part of the community is under the spell of anti-vaccination pseudoscience. It’s just not gonna happen.
But maybe my root cause analysis is wrong and the EA community can course correct without fundamentally divorcing LessWrong. I don’t know. I hope that, whatever the root cause is and whatever it takes to fix it, the EA community’s current low standards for evidence and argumentation pertaining to AGI risk get raised significantly.
I don’t think it’s a brand new problem, by the way. Around 2016, I was periodically arguing with people about AI on the main EA group on Facebook. One of my points of contention was that MIRI’s focus on symbolic AI was a dead-end and that machine learning had empirically produced much better results, and was where the AI field was now focused. (MIRI took a long time before they finally hired their first researcher to focus on machine learning.) I didn’t have any more success convincing people about that back then than I’ve been having lately with my current points of contention.
I agree though that the situation seems to have gotten much worse in recent years, and ChatGPT (and LLMs in general) probably had a lot to do with that.
I don't think EA's AI focus is a product only of interaction with Less Wrong (not claiming you said otherwise), but I do think people outside the Less Wrong bubble tend to be less confident AGI is imminent, and in that sense less "cautious".
I think EA's AI focus is largely a product of the fact that Nick Bostrom knew Will and Toby when they were founding EA, and was a big influence on their ideas. Of course, to some degree this might be indirect influence from Yudkowsky since he was always interacting with Nick Bostrom, but it's hard to know in what direction the influence flowed here. I was around in Oxford during the embryonic stages of EA, and while I was not involved (beyond being a GWWC member), I did have the odd conversation with people who were involved, and my memory is that even then, people were talking about X-risk from AI as a serious contender for the best cause area, as early as at least 2014, and maybe a bit before that. They (EDIT: by "they" here I mean some people in Oxford, I don't remember who; I don't know when Will and Toby specifically first interacted with LW folk) were involved in discussion with LW people, but I don't think they got the idea FROM LW. Seems more likely to me they got it from Bostrom and the Future of Humanity Institute, who were just down the corridor.
What is true is that Oxford people have genuinely expressed much more caution about timelines. I.e. in What We Owe the Future, published as late as 2022, Will is still talking about how AGI might be more than 50 years away, but also "it might come soon — within the next fifty or even twenty years." (If you're wondering what evidence he cites, it's the Cotra bioanchors report.) His discussion primarily emphasizes uncertainty about exactly when AGI will arrive, and how we can't be confident it's not close. He cites a figure from an Open Phil report guessing an 8% chance of AGI by 2036*. I know your view is that this is all wildly wrong still, but it's quite different from what many (not all) Less Wrong people say, who tend to regard 20 years as a long timeline. (Maybe Will has updated to shorter timelines since, of course.)
I think there is something of a divide between people who believe strongly in a particular set of LessWrong derived ideas about the imminence of AGI, and another set of people who are mainly driven by something like "we should take positive EV bets with a small chance of paying off, and doing AI stuff just in case AGI arrives soon". Defending the point about taking positive EV bets with only a small chance of pay-off is what a huge amount of the academic work on Longtermism at the GPI in Oxford was about. (This stuff definitely has been subjected to severe levels of peer-reviewed scrutiny, as it keeps showing up in top philosophy journals with rejection rates of like, 90%.)
*This is more evidence people were prepared to bet big on AI risk long before the idea that AGI is actually imminent became as popular as it is now. I think people just rejected the idea that useful work could only be done when AGI was definitely near, and we had near-AGI models.
eh, I think the main reason EAs believe AGI stuff is reasonably likely is because this opinion is correct, given the best available evidence[1].
Having a genealogical explanation here is sort of answering the question on the wrong meta-level, like giving a historical explanation for "why do evolutionists believe in genes" or telling a touching story about somebody's pet pig for "why do EAs care more about farmed animal welfare than tree welfare."
Or upon hearing "why does Google use ads instead of subscriptions?" answering with the history of their DoubleClick acquisition. That history is real, but it's downstream of the actual explanation: the economics of internet search heavily favor ad-supported models regardless of the specific path any company took. The genealogy is epiphenomenal.
The historical explanations are thus mildly interesting but they conflate the level of why.
EDIT: man I'm worried my comment will be read as a soldier-mindset thing that only makes sense if you presume the "AGI likely soon" view is already correct. Which does not improve on the conversation. Please only upvote it if a version of you that's neutral on the object-level question would also upvote this comment.
Which is a different claim from whether it's ultimately correct. Reality is hard.
Yeah, it's a fair objection that even answering the why question like I did presupposes that EAs are wrong, or at least, merely luckily right. (I think this is a matter of degree, and that EAs overrated the imminence of AGI and the risk of takeover on average, but it's still at least reasonable to believe AI safety and governance work can have very high expected value for roughly the reasons EAs do.) But I was responding to Yarrow, who does think that EAs are just totally wrong, so I guess really I was saying that "conditional on a sociological explanation being appropriate, I don't think it's as LW-driven as Yarrow thinks", although LW is undoubtedly important.
Right, to be clear I'm far from certain that the stereotypical "EA view" is right here.
Sure that makes a lot of sense! I was mostly just using your comment to riff on a related concept.
I think reality is often complicated and confusing, and it's hard to separate out contingent vs. inevitable stories for why people believe what they believe. But I think the correct view is that EAs' belief on AGI probability and risk (within an order of magnitude or so) is mostly not contingent (as of the year 2025) even if it turns out to be ultimately wrong.
The Google ads example was the best example I could think of to illustrate this. I'm far from certain that Google's decision to use ads was actually the best source of long-term revenue (never mind being morally good lol). But it still seems implausible, given the internet as we understand it, that Google's use of ads was counterfactually due to their specific acquisitions.
Similarly, even if EAs ignored AI before for some reason, and never interacted with LW or Bostrom, it's implausible that, as of 2025, people who are concerned with ambitious, large-scale altruistic impact (and have other epistemic, cultural, and maybe demographic properties characteristic of the movement) would not think of AI as a big deal. AI is just a big thing in the world that's growing fast. Anybody capable of reading graphs can see that.
That said, specific micro-level beliefs (and maybe macro ones) within EA and AI risk might be different without influence from either LW or the Oxford crowd. For example there might be a stronger accelerationist arm. Alternatively, people might be more queasy about the closeness to the major AI companies, and there might be a stronger and more well-funded contingent of folks interested in public messaging on pausing or stopping AI. And in general if the movement didn't "wake up" to AI concerns at all pre-ChatGPT I think we'd be in a more confused spot.
How many angels can dance on the head of a pin? An infinite number because angels have no spatial extension? Or maybe if we assume angels have a diameter of ~1 nanometre plus ~1 additional nanometre of diameter for clearance for dancing we can come up with a ballpark figure? Or, wait, are angels closer to human-sized? When bugs die do they turn into angels? What about bacteria? Can bacteria dance? Are angels beings who were formerly mortal, or were they "born" angels?[1]
Well, some of the graphs are just made-up, like those in "Situational Awareness", and some of the graphs are woefully misinterpreted to be about AGI when they’re clearly not, like the famous METR time horizon graph.[2] I imagine that a non-trivial amount of EA misjudgment around AGI results from a failure to correctly read and interpret graphs.
And, of course, when people like titotal examine the math behind some of these graphs, like those in AI 2027, they are sometimes found to be riddled with major mistakes.
What I said elsewhere about AGI discourse in general is true about graphs in particular: the scientifically defensible claims are generally quite narrow, caveated, and conservative. The claims that are broad, unqualified, and bold are generally not scientifically defensible. People at METR themselves caveat the time horizons graph and note its narrow scope (I cited examples of this elsewhere in the comments on this post). Conversely, graphs that attempt to make a broad, unqualified, bold claim about AGI tend to be complete nonsense.
Out of curiosity, roughly what probability would you assign to there being an AI financial bubble that pops sometime within the next five years or so? If there is an AI bubble and if it popped, how would that affect your beliefs around near-term AGI?
How is correctness physically instantiated in space and time and how does it physically cause physical events in the world, such as speaking, writing, brain activity, and so on? Is this an important question to ask in this context? Do we need to get into this?
You can take an epistemic practice in EA such as "thinking that Leopold Aschenbrenner's graphs are correct" and ask about the historical origin of that practice without making a judgement about whether the practice is good or bad, right or wrong. You can ask the question in a form like, "How did people in EA come to accept graphs like those in 'Situational Awareness' as evidence?" If you want to frame it positively, you could ask the question as something like, "How did people in EA learn to accept graphs like these as evidence?" If you want to frame it negatively, you could ask, "How did people in EA not learn not to accept graphs like these as evidence?" And of course you can frame it neutrally.
The historical explanation is a separate question from the evaluation of correctness/incorrectness and the two don't conflict with each other. By analogy, you can ask, "How did Laverne come to believe in evolution?" And you could answer, "Because it's the correct view," which would be right, in a sense, if a bit obtuse, or you could answer, "Because she learned about evolution in her biology classes in high school and college", which would also be right, and which would more directly answer the question. So, a historical explanation does not necessarily imply that a view is wrong. Maybe in some contexts it insinuates it, but both kinds of answers can be true.
But this whole diversion has been unnecessary.
Do you know a source that formally makes the argument that the METR graph is about AGI? I am trying to pin down the series of logical steps that people are using to get from that graph to AGI. I would like to spell out why I think this inference is wrong, but first it would be helpful to see someone spell out the inference they’re making.
Upvoted because I think this is interesting historical/intellectual context, but I think you might have misunderstood what I was trying to say in the comment you replied to. (I joined Giving What We Can in 2009 and got heavily involved in my university EA group from 2015-2018, so I’m aware that AI has been a big topic in EA for a very long time, but I’ve never had any involvement with Oxford University or had any personal connections with Toby Ord or Will MacAskill, besides a few passing online interactions.)
In my comment above, I wasn’t saying that EA’s interpenetration with LessWrong is largely to blame for the level of importance that the ideas of near-term AGI and AGI risk currently have in EA. (I also think that is largely true, but that wasn’t the point of my previous comment.) I was saying that the influence of LessWrong and EA’s embrace of the LessWrong subculture is largely to blame for the EA community accepting ridiculous stuff like "Situational Awareness", AI 2027, and so on, despite it having glaring flaws.
Focus on AGI risk at the current level EA gives it could be rational, or it might not be. What is definitely true is that the EA community accepts a lot of completely irrational stuff related to AGI risk. LessWrong doesn’t believe in academia, institutional science, academic philosophy, journalism, scientific skepticism, common sense, and so on. LessWrong believes in Eliezer Yudkowsky, the Sequences, and LessWrong. So, members of the LessWrong community go completely off the rails and create or join cults at seemingly a much, much higher rate than the general population. Because they’ve been coached to reject the foundations of sanity that most people have, and to put their trust and belief in this small, fringe community.
The EA community is not nearly as bad as LessWrong. If I thought it was as bad, I wouldn’t bother trying to convince anyone in EA of anything, because I would think they were beyond rational persuasion. But EA has been infected to a very significant degree by the LessWrong irrationality. I think the level of emphasis that EA puts on subjective guesses as a source of truth and an accompanying sort of lazy, incurious approach to inquiry (why look stuff up or attempt to create a rigorous, defensible thesis when you can just guess stuff?) is one example of the LessWrong influence. Eliezer Yudkowsky quite literally, explicitly believes that his subjective guesses are a better guide to truth than the approach of traditional, mainstream scientific institutions and communities. Yudkowsky has attempted to teach his approach to subjectively guessing things to the LessWrong community (and enjoyers of Harry Potter fanfic). That approach has leeched into the EA community.
The result is you can have things like "Situational Awareness" and AI 2027 where the "data" is made-up and just consists of some random people’s random subjective guesses. This is the kind of stuff that should never be taken even a little bit seriously.
If you want to know which approach produces better results, look at the achievements of academic science — which underlie basically the entire modern world — versus the achievements of the LessWrong community — some Harry Potter fanfic and about half a dozen cults, despite believing their approach is unambiguously superior. If you adjust for time and population, the comparison still comes out favourably for science versus Yudkowskian subjective guessology. How many towns of under 5,000 people create even a single cult within even the span of 50 years? Versus the LessWrong community creating multiple cults within 16 years of its existence.
I could be totally wrong in my root cause analysis. EA may have developed these bad habits independently of LessWrong. In any case, I think it’s clear that these are bad habits, that they lead nowhere good, and that EA should clear house (i.e. stop believing in subjective guess-based or otherwise super low-quality argumentative writing) and raise the bar for the quality of arguments and evidence that are taken seriously to something a bit closer to the academic or scientific level.
I don’t have an idyllic view of academia. I don’t think it’s all black-and-white. I recently re-read a review of Colin McGinn’s ridiculous book on the philosophy of physics. On one hand, the descriptions of and quotes from the book reminded me of all the stuff that drives me crazy in academic philosophy. On the other hand, the reviewer is a philosopher and her review is published in a philosophy journal. So, there’s a push and pull.
Maybe a good analogy for academia is liberal democracy. It’s often a huge mess, full of ongoing conflicts and struggles, frequently unjust and unreasonable, but ultimately it produces an astronomical amount of value, rivalling the best of anything humans have ever done. By vouching for academia or liberal democracy, I’m not saying it’s all good, I’m just saying that the overall process is good. And the process itself (in both cases) can surely be improved, but through reform and evolution involving a lot of people with expertise, not by a charismatic outsider with a zealous following (e.g. illiberal/authoritarian strongmen, in the case of government, or someone like Yudkowsky, in the case of academia, who, incidentally, has a bit of an authoritarian attitude, not politically, but intellectually).
Can you say more about what makes something "a subjective guess" for you? When you say well under 0.05% chance of AGI in 10 years, is that a subjective guess?
Like, suppose I am asked, as a pro-forecaster, to say whether the US will invade Syria after a US military build-up involving aircraft carriers in the Eastern Med, and I look for newspaper reports of signs of this, look up the base rate of how often the US bluffs with a military build-up rather than invading, and then make a guess as to how likely an invasion is. Is that "a subjective guess"? Or am I relying on data?

What about if I am doing what AI 2027 did and trying to predict when LLMs will match human coding ability on the basis of current data? Suppose I use the METR data like they did, and I do the following. I assume that if AIs are genuinely able to complete 90% of real-world tasks that take human coders 6 months, then they are likely as good at coding as humans. I project the METR data out to find a date for when we will hit 6-month tasks, theoretically, if the trend continues. But then, instead of stopping and saying that is my forecast, I remember that benchmark performance is generally a bit misleading in terms of real-world competence, and remember METR found that AIs often couldn't complete more realistic versions of the tasks which the benchmark counted them as passing. (Couldn't find a source for this claim, but I remember seeing it somewhere.) I decide that the date when models will hit a 90% completion rate on real-world 6-month tasks should maybe be a couple more doubling times of the 90% time-horizon METR metric further out. I move my forecast for human-level coders to, say, 15 months after the original to reflect this.

Am I making a subjective guess, or relying on data? When I made the adjustment to reflect issues about construct validity, did that make my forecast more subjective? If so, did it make it worse, or did it make it better? I would say better, and I think you'd probably agree, even if you still think the forecast is bad.
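To make the arithmetic in this hypothetical concrete, here is a minimal sketch of that extrapolation with made-up placeholder numbers (the current time horizon, the doubling time, and the start date are all hypothetical, not METR's actual figures):

```python
import math
from datetime import date, timedelta

# Hypothetical placeholder inputs -- not METR's actual numbers.
current_horizon_hours = 2.0      # time horizon of current models, in hours of human work
doubling_time_months = 7.0       # assumed doubling time of that horizon
target_horizon_hours = 6 * 170   # ~6 months of full-time human work (~170 hours/month)

# Naive projection: how many doublings until the target horizon is reached?
doublings_needed = math.log2(target_horizon_hours / current_horizon_hours)
months_to_target = doublings_needed * doubling_time_months

# The adjustment described above: push the forecast out by a couple more
# doubling times, to reflect benchmark performance overstating real-world competence.
extra_doublings = 2
adjusted_months = months_to_target + extra_doublings * doubling_time_months

start = date(2025, 12, 1)  # arbitrary "today"
naive_date = start + timedelta(days=months_to_target * 30.44)
adjusted_date = start + timedelta(days=adjusted_months * 30.44)

print(f"Naive projection: ~{months_to_target:.0f} months out ({naive_date:%Y-%m})")
print(f"With the construct-validity haircut: ~{adjusted_months:.0f} months out ({adjusted_date:%Y-%m})")
```

Whether the `extra_doublings` adjustment makes the forecast more "subjective" or more "data-driven" is exactly the question at issue.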
This geopolitical example here is not particularly hypothetical. I genuinely get paid to do this for Good Judgment, and not ONLY by EA orgs, although often it is by them. We don't know who the clients are, but some questions have been clearly commercial in nature and of zero EA interest.
I'm not particularly offended* if you think this kind of "get allegedly expert forecasters, rather than or as well as domain experts, to predict stuff" is nonsense. I do it because people pay me and it's great fun, rather than because I have seriously investigated its value. But I do disagree with the idea that this is distinctively a Less Wrong rationalist thing. There's a whole history of relatively well-known work on it by the American political scientist Philip Tetlock that I think began when Yudkowsky was literally still a child. It's out of that work that Good Judgment, the org for which I work as a forecaster, comes, not anything to do with Less Wrong. It's true that LessWrong rationalists are often enthusiastic about it, but that's not all that interesting on its own. (In general many Yudkowskian ideas actually seem derived from quite mainstream sources on rationality and decision-making to me. I would not reject them just because you don't like what LW does with them. Bayesian epistemology is a real research program in philosophy, for example.)
*Or at least, I am trying my best not to be offended, because I shouldn't be, but of course I am human and objectivity about something I derive status and employment from is hard. Though I did have a cool conversation at the last EAG London with a very good forecaster who thought it was terrible Open Phil put money into forecasting because it just wasn't very useful or important.
What does the research literature say about the accuracy of short-term (e.g. 1-year timescales) geopolitical forecasting?
And what does the research literature say about the accuracy of long-term (e.g. longer than 5-year timescales) forecasting about technological progress?
(Should you even bother to check the literature to find out, or should you just guess how accurate you think each one probably is and leave it at that?)
Of course. And I'll add that I think such guesses, including my own, have very little meaning or value. It may even be worse to make them than to not make them at all.
This seems like a huge understatement. My impression is that the construct validity and criterion validity of the benchmarks METR uses, i.e. how much benchmark performance translates into real world performance, is much worse than you describe.
I think it would be closer to the truth to say that if you're trying to predict when AI systems will replace human coders, the benchmarks are meaningless and should be completely ignored. I'm not saying that's the absolute truth, just that it's closer to the truth than saying benchmark performance is "generally a bit misleading in terms of real-world competence".
Probably there's some loose correlation between benchmark performance and real-world competence, but it's not nearly one-to-one.
Definitely making a subjective guess. For example, what if performance on benchmarks simply never generalizes to real world performance? Never, ever, ever, not in a million years never?
By analogy, what level of performance on go would AlphaGo need to achieve before you would guess it would be capable of baking a delicious croissant? Maybe these systems just can't do what you're expecting them to do. And a chart can't tell you whether that's true or not.
AI 2027 admits the role that gut intuition plays in their forecast. For example:
An example intuition:
Okay, and what if it is hard? What if this kind of generalization is beyond the capabilities of current deep learning/deep RL systems? What if it takes 20+ years of research to figure out? Then the whole forecast is out the window.
What's the reward signal for vague tasks? This touches on open research problems that have existed in deep RL for many years. Why is this going to be fully solved within the next 2-4 years? Because "intuitively, it feels like" it will be?
Another example is online learning, which is a form of continual learning. AI 2027 highlights this capability:
But I can't find anywhere else in any of the AI 2027 materials where they discuss online learning or continual learning. Are they thinking that online learning will not be one of the capabilities humans will have to invent? That AI will be able to invent online learning without first needing online learning to be able to invent such things? What does the scenario actually assume about online learning? Is it important or not? Is it necessary or unnecessary? And will it be something humans invent or AI invents?
When I tried to find what the AI 2027 authors have said about this, I found an 80,000 Hours Podcast interview where Daniel Kokotajlo said a few things about online learning, such as the following:
The other things Kokotajlo says in the interview about online learning and data efficiency are equally hazy and hand-wavy. It just comes down to his personal gut intuition. In the part I just quoted, he says maybe these fundamental research breakthroughs will happen in 2030-2035, but what if it's more like 2070-2075, or 2130-2135? How would one come to know such a thing?
What historical precedent or scientific evidence do we have to support the idea that anyone can predict, with any accuracy, the time when new basic science will be discovered? As far as I know, this is not possible. So, what's the point of AI 2027? Why did the authors write it and why did anyone other than the authors take it seriously?
nostalgebraist originally made this critique here, very eloquently.
It can easily be true that Yudkowsky’s ideas about things are loosely derived from or inspired by ideas that make sense, while Yudkowsky’s own ideas don’t make a lick of sense themselves. I don’t think most self-identified Bayesians outside of the LessWrong community would agree with Yudkowsky’s rejection of institutional science, for instance. Yudkowsky’s irrationality says nothing about whether (the mainstream version of) Bayesianism is a good idea or not; whether (the mainstream version of) Bayesianism, or other ideas Yudkowsky draws from, are a good idea or not says nothing about whether Yudkowsky’s ideas are irrational.
By analogy, pseudoscience and crackpot physics are often loosely derived from or inspired by ideas in mainstream science. The correctness of mainstream science doesn't imply the correctness of pseudoscience or crackpot physics. Conversely, the incorrectness of pseudoscience or crackpot physics doesn't imply the incorrectness of mainstream science. It wouldn't be a defense of a crackpot physics theory that it's inspired by legitimate physics, and the legitimacy of the ideas Yudkowsky is drawing from isn't a defense of Yudkowsky's bizarre views.
I think forecasting is perfectly fine within the limitations that the scientific research literature on forecasting outlines. I think Yudkowsky’s personal twist on Aristotelian science (subjectively guessing which scientific propositions are true or false and then assuming he’s right, without seeking empirical evidence, because he thinks he has some kind of nearly superhuman intelligence) is absurd, and obviously not what people like Philip Tetlock have been advocating.
I'm not actually that interested in defending:
Rather what I took myself to be saying was:
Now, it may be that forecasting is useless here, because no one can predict how technology will develop five years out. But I'm pretty comfortable saying that if THAT is your view, then you really shouldn't also be super-confident the chance of near-term AGI is low. Though I do think saying "this just can't be forecasted reliably" on its own is consistent with criticizing people who are confident AGI is near.
Strong upvoted. Thank you for clarifying your views. That’s helpful. We might be getting somewhere.
With regard to AI 2027, I get the impression that a lot of people in EA and in the wider world were not initially aware that AI 2027 was an exercise in judgmental forecasting. The AI 2027 authors did not sufficiently foreground this in the presentation of their "results". I would guess there are still a lot of people in EA and outside it who think AI 2027 is something more rigorous, empirical, quantitative, and/or scientific than a judgmental forecasting exercise.
I think this was a case of some people in EA being fooled or tricked (even if that was not the authors’ intention). They didn’t evaluate the evidence they were looking at properly. You were quick to agree with my characterization of AI 2027 as a forecast based on subjective intuitions. However, in one previous instance on the EA Forum, I also cited nostalgebraist’s eloquent post and made essentially the same argument I just made, and someone strongly disagreed. So, I think people are just getting fooled, thinking that evidence exists that really doesn’t.
What does the forecasting literature say about long-term technology forecasting? I’ve only looked into it a little bit, but generally technology forecasting seems really inaccurate, and the questions forecasters/experts are being asked in those studies seem way easier than forecasting something like AGI. So, I’m not sure there is a credible scientific basis for the idea of AGI forecasting.
I have been saying from the beginning and I’ll say once again that my forecast of the probability and timeline of AGI is just a subjective guess and there’s a high level of irreducible uncertainty here. I wish that people would stop talking so much about forecasting and their subjective guesses. This eats up an inordinate portion of the conversation, despite its low epistemic value and credibility. For months, I have been trying to steer the conversation away from forecasting toward object-level technical issues.
Initially, I didn’t want to give any probability, timeline, or forecast, but I realized the only way to be part of the conversation in EA is to "play the game" and say a number. I had hoped that would only be the beginning of the conversation, not the entire focus of the conversation forever.
You can’t squeeze Bayesian blood from a stone of uncertainty. You can’t know what you can’t know by an act of sheer will. Most discussion of AGI forecasting is wasted effort. Most of it is mostly pointless.
What is not pointless is understanding the object-level technical issues better. If anything helps with AGI forecasting accuracy (and that’s a big "if"), this will. But it also has other important advantages, such as:
And more topics besides these.
I would consider it a worthy contribution to the discourse to play some small part in raising the overall knowledge level of people in EA about the object-level technical issues relevant to the AI frontier and to AGI. Based on track records, technology forecasting may be mostly forlorn, but, based on track records, science certainly isn’t forlorn. Focusing on the science of AI rather than on an Aristotelian approach would be a beautiful return to Enlightenment values, away from the anti-scientific/anti-Enlightenment thinking that pervades much of this discourse.
By the way, in case it’s not already clear, saying there is a high level of irreducible uncertainty does not support funding whatever AGI-related research program people in EA might currently feel inclined to fund. The number of possible ways the mind could work and the number of possible paths the future could take is large, perhaps astronomically large, perhaps infinite. To arbitrarily seize on one and say that’s the one, pour millions of dollars into that — that is not justifiable.
I think what you are saying here is mostly reasonable, even if I am not sure how much I agree: it seems to turn on very complicated issues in the philosophy of probability/decision theory, and what you should do when accurate prediction is hard, and exactly how bad predictions have to be to be valueless. Having said that, I don't think you're going to succeed in steering conversation away from forecasts if you keep writing about how unlikely it is that AGI will arrive near term. Which you have done a lot, right?
I'm genuinely not sure how much EA funding for AI-related stuff even is wasted on your view. To a first approximation, EA is what Moskovitz and Tuna fund. When I look at Coefficient's (i.e. what previously was Open Phil's) 7 most recent AI safety and governance grants, here's what I find:
1) A joint project of METR and RAND to develop new ways of assessing AI systems for risky capabilities.
2) "AI safety workshop field building" by BlueDot Impact
3) An AI governance workshop at ICML
4) "General support" for the Center for Governance of AI.
5) A "study on encoded reasoning in LLMs at the University of Maryland"
6) "Research on misalignment" here: https://www.meridiancambridge.org/labs
7) "Secure Enclaves for LLM Evaluation" here https://openmined.org/
So is this stuff bad or good on the worldview you've just described? I have no idea, basically. None of it is forecasting; plausibly it all broadly falls under either empirical research on current and very-near-future models, training new researchers, or governance stuff, though that depends on what "research on misalignment" means. But of course, you'd only endorse it if it is good research. If you are worried about lack of academic credibility specifically, as far as I can tell 7 out of the 20 most recent grants are to academic research in universities. It does seem pretty obvious to me that significant ML research goes on at places other than universities, though, not least the frontier labs themselves.
I don’t really know all the specifics of all the different projects and grants, but my general impression is that very little (if any) of the current funding makes sense or can be justified if the goal is to do something useful about AGI (as opposed to, say, making sure Claude doesn’t give risky medical advice). Absent concerns about AGI, I don’t know if Coefficient Giving would be funding any of this stuff.
To make it a bit concrete, there are at least five different proposed pathways to AGI, and I imagine the research Coefficient Giving funds is only relevant to one of the five pathways, if it’s even relevant to that one. But the number five is arbitrary here. The actual decision-relevant number might be a hundred, or a thousand, or a million, or infinity. It just doesn’t feel meaningful or practical to try to map out the full space of possible theories of how the mind works and apply the precautionary principle against the whole possibility space. Why not just do science instead?
By word count, I think I’ve written significantly more about object-level technical issues relevant to AGI than directly about AGI forecasts or my subjective guesses of timelines or probabilities. The object-level technical issues are what I’ve tried to emphasize. Unfortunately, commenters seem fixated on surveys, forecasts, and bets, and don’t seem to be as interested in the object-level technical topics. I keep trying to steer the conversation in a technical direction. But people keep wanting to steer it back toward forecasting, subjective guesses, and bets.
For example, I wrote a 2,000-word post called "Unsolved research problems on the road to AGI". There are two top-level comments. The one with the most karma proposes a bet.
My post "Frozen skills aren’t general intelligence" mainly focuses on object-level technical issues, including some of the research problems discussed in the other post. You have the top comment on that post (besides SummaryBot) and your comment is about a forecasting survey.
People on the EA Forum are apparently just really into surveys, bets, and forecasts.
The forum is kind of a bit dead generally, for one thing.
I don't really get on what grounds you are saying that the Coefficient grants are not funding people to do science, apart from the governance ones. I also think you are switching back and forth between "No one knows when AGI will arrive, best way to prepare just in case is more normal AI science" and "we know that AGI is far, so there's no point doing normal science to prepare against AGI now, although there might be other reasons to do normal science."
If we don’t know which of infinite or astronomically many possible theories about AGI are more likely to be correct than the others, how can we prepare?
Maybe alignment techniques conceived based on our current wrong theory make otherwise benevolent and safe AGIs murderous and evil on the correct theory. Or maybe they’re just inapplicable. Who knows?
Not everything being funded here even IS alignment techniques, but also, insofar as you just want a generally better understanding of AI as a domain through science, why wouldn't you learn useful stuff from applying techniques to current models? If the claim is that current models are too different from any possible AGI for this info to be useful, why do you think "do science" would help prepare for AGI at all? Assuming you do think that, which still seems unclear to me.
You might learn useful stuff about current models from research on current models, but not necessarily anything useful about AGI (except maybe in the slightest, most indirect way). For example, I don't know if anyone thinks that, if we had invested 100x or 1,000x more into research on symbolic AI systems 30 years ago, we would know meaningfully more about AGI today. So, as you anticipated, the relevance of this research to AGI depends on an assumption about the similarity between a hypothetical future AGI and current models.
However, even if you think AGI will be similar to current models, or it might be similar, there might be no cost to delaying research related to alignment, safety, control, preparedness, value lock-in, governance, and so on until more fundamental research progress on capabilities has been made. If in five or ten or fifteen years or whatever we understand much better how AGI will be built, then a single $1 million grant to a few researchers might produce more useful knowledge about alignment, safety, etc. than Dustin Moskovitz's entire net worth would produce today if it were spent on research into the same topics.
My argument about "doing basic science" vs. "mitigating existential risk" is that these collapse into the same thing unless you make very specific assumptions about which theory of AGI is correct. I don't think those assumptions are justifiable.
Put it this way: let's say we are concerned that, for reasons due to fundamental physics, the universe might spontaneously end. But we also suspect that, if this is true, there may be something we can do to prevent it. What we want to know is a) if the universe is in danger in the first place, b) if so, how soon, and c) if so, what we can do about it.
To know any of these three things, (a), (b), or (c), we need to know which fundamental theory of physics is correct, and what the fundamental physical properties of our universe are. Problem is, there are half a dozen competing versions of string theory, and within those versions, the number of possible variations that could describe our universe is astronomically large, 10^500, or 10^272,000, or possibly even infinite. We don't know which variation correctly describes our universe.
Plus, a lot of physicists say string theory is a poorly conceived theory in the first place. Some offer competing theories. Some say we just don't know yet. There's no consensus. Everybody disagrees.
What does the "existential risk" framing get us? What action does it recommend? How does the precautionary principle apply? Let's say you have a $10 billion budget. How do you spend it to mitigate existential risk?
I don't see how this doesn't just loop all the way back around to basic science. Whether there's an existential risk, and if so, when we need to worry about it, and if when the time comes, what we can do about it, are all things we can only know if we figure out the basic science. How do we figure out the basic science? By doing the basic science. So, your $10 billion budget will just go to funding basic science, the same physics research that is getting funded anyway.
The space of possible theories about how the mind works includes at least six serious candidates, plus a lot of people saying we just don't know yet, and there are probably silly but illustrative ways to formulate the question where you get very large numbers.
For instance, if we think the correct theory can be summed up in just 100 bits of information, then the number of possible 100-bit theories is 2^100, or roughly 10^30.
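As a quick sanity check of that counting arithmetic, here is a minimal sketch (the bit counts are purely illustrative, as in the thought experiment above):

```python
# Each additional bit doubles the number of distinct encodings, so a theory
# that takes N bits to state is one of 2**N possible N-bit strings.
# The bit counts below are purely illustrative, as in the thought experiment above.

def num_possible_theories(bits: int) -> int:
    return 2 ** bits

for bits in (10, 50, 100):
    print(f"{bits} bits -> {num_possible_theories(bits):.2e} possible encodings")
# 10 bits -> 1.02e+03
# 50 bits -> 1.13e+15
# 100 bits -> 1.27e+30
```

Of course, most of those bit strings wouldn't encode anything like a coherent theory; the point is only that the raw possibility space blows up very quickly.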
Or we could imagine what would happen if we paid a very large number of experts from various relevant fields (e.g. philosophy, cognitive science, AI) a lot of money to spend a year coming up with a one-to-two-page description of as many original, distinct, even somewhat plausible or credible theories as they could think of. Then we group together all the submissions that were similar enough and counted them as the same theory. How many distinct theories would we end up with? A handful? Dozens? Hundreds? Thousands?
I'm aware these thought experiments are ridiculous, but I'm trying to emphasize the point that the space of possible ideas seems very large. At the frontier of knowledge in a domain like the science of the mind, which largely exists in a pre-scientific or protoscientific or pre-paradigmatic state, trying to actually map out the space of theories that might possibly be correct is a daunting task. Doing that well, to a meaningful extent, ultimately amounts to actually doing the science or advancing the frontier of knowledge yourself.
What is the right way to apply the precautionary principle in this situation? I would say the precautionary principle isn't the right way to think about it. We would like to be precautionary, but we don't know enough to know how to be. We're in a situation of fundamental, wide-open uncertainty, at the frontier of knowledge, in a largely pre-scientific state of understanding about the nature of the mind and intelligence. So, we don't know how to reduce risk — for example, our ideas on how to reduce risk might do nothing or they might increase risk.
I think the question is not 'whether there should be EA efforts and attention devoted to AGI' but whether the scale of these efforts is justified - and more importantly, how do we reach that conclusion.
You say no one else is doing effective work on AI and AGI preparedness but what I see in the world suggests the opposite. Given the amount of attention and investment of individuals, states, universities, etc in AI, statistically there's bound to be a lot of excellent people working on it, and that number is set to increase as AI becomes even more mainstream.
Teenagers nowadays have bought in to the idea that 'AI is the future' and go to study it at university, because working in AI is fashionable and well-paid. This is regardless of whether they've had deep exposure to Effective Altruism. If EA wanted this issue to become mainstream, well... it has undoubtedly succeeded. I would tend to agree that at this moment in time, a redistribution of resources and 'hype' towards other issues would be justified and welcome (but again, crucially, we have to agree on a transparent method to decide whether this is the case, as OP I think importantly calls for).
What percentage chance would you put on an imminent alien invasion and what amount of resources would say is rational to allocate for defending against it? The astrophysicist Avi Loeb at Harvard is warning that there is a 30-40% chance interstellar alien probes have entered our solar system and pose a threat to human civilization. (This is discussed at length in the Hank Green video I linked and embedded in the post.)
It’s possible we should start investing in R&D now that can defend against advanced autonomous space-based technologies that might be used by a hostile alien intelligence. Even if you think there’s only a 1% chance of this happening, it justifies some investment.
As I see it, this isn’t happening — or just barely. Everything flows from the belief that AGI is imminent, or at least that there’s a very significant, very realistic chance (10%+) that it’s imminent, and whether that’s true or not is barely ever questioned.
Extraordinary claims require extraordinary evidence. Most of the evidence cited by the EA community is akin to pseudoscience — Leopold Aschenbrenner’s "Situational Awareness" fabricates graphs and data; AI 2027 is a complete fabrication and the authors openly admit that, but that admission is not foregrounded enough, such that the presentation is misleading. Most stuff is just people reasoning philosophically based on hunches. (And in these philosophical discussions, people in EA even invent their own philosophical terms like "epistemics" and "truth-seeking" that have no agreed-upon definition that anyone has written down — and which don’t come from academic philosophy or any other academic field.)
It’s very easy to make science come to whatever conclusion you want when you can make up data or treat personal subjective guesses plucked from thin air as data. Very little EA "research" on AGI would clear the standards for publication in a reputable peer-reviewed journal, and the research that would clear that standard (and has occasionally passed it, in fact) tends to make much more narrow, conservative, caveated claims than the beliefs that people in EA actually hold and act on. The claims that people in EA are using to guide the movement are not scientifically defensible. If they were, they would be publishable.
There is a long history of an anti-scientific undercurrent in the EA community. People in EA are frequently disdainful of scientific expertise. Eliezer Yudkowsky and Nate Soares seem to call for the rejection of the "whole field of science" in their new book, which is a theme Yudkowsky has promoted in his writing for many years.
The overarching theme of my critique is that the EA approach to the near-term AGI question is unscientific and anti-scientific. The treatment of the question of whether it’s possible, realistic, or likely in the first place is unscientific and anti-scientific. It isn’t an easy out to invoke the precautionary principle because the discourse/"research" on what to do to prepare for AGI is also unscientific and anti-scientific. In some cases, it seems incoherent.
If AGI will require new technology and new science, and we don’t yet know what that technology or science is, then it’s highly suspect to claim that we can do something meaningful to prepare for AGI now. Preparation depends on specific assumptions about the unknown science and technology that can’t be predicted in advance. The number of possibilities is far too large to prepare for them all, and most of them we probably can’t even imagine.
Your picture of EA work on AGI preparation is inaccurate to the extent that I don't think you made a serious effort to understand the space you're criticizing. Most of the work looks like METR benchmarking, model card/RSP policy (companies should test new models for dangerous capabilities and propose mitigations/make safety cases), mech interp, compute monitoring/export controls research, and trying to test for undesirable behavior in current models.
Other people do make forecasts that rely on philosophical priors, but those forecasts are extrapolating and responding to the evidence being generated. You're welcome to argue that their priors are wrong or that they're overconfident, but comparing this to preparing for an alien invasion based on Oumuamua is bad faith. We understand the physics of space travel well enough to confidently put a very low prior on alien invasion. One thing basically everyone in the AI debate agrees on is that we do not understand where the limits of progress are as data reflecting continued progress continues to flow.
Your accusation of bad faith is incorrect. You shouldn’t be so quick to throw the term "bad faith" around (it means something specific and serious, involving deception or dishonesty) just because you disagree with something — that’s a bad habit that closes you off to different perspectives.
I think it’s an entirely apt analogy. We do not have an argument from the laws of physics that shows Avi Loeb is wrong about the possible imminent threat from aliens, or the probability of it. The most convincing argument against Loeb’s conclusions is about the epistemology of science. That same argument applies, mutatis mutandis, to near-term AGI discourse.
With the work you mentioned, there is often an ambiguity involved. To the extent it’s scientifically defensible, it’s mostly not about AGI. To the extent it’s about AGI, it’s mostly not scientifically defensible.
For example, the famous METR graph about the time horizons of tasks AI systems can complete 50% (or 80%) of the time is probably perfectly fine if you only take it for what it is, which is a fairly narrow, heavily caveated series of measurements of current AI systems on artificially simplified benchmark tasks. That’s scientifically defensible, but it’s not about AGI. When people outside of METR make an inference from this graph to conclusions about imminent AGI, that is not scientifically defensible. This is not a complaint about METR’s research — which is not directly about AGI (at least not in this case) — but about the interpretation of it by people outside of METR to draw conclusions the research does not support. That interpretation is just a hand-wavy philosophical argument, not a scientifically defensible piece of research.
Just to be clear, this is not a criticism of METR, but a criticism of people who misinterpret their work and ignore the caveats that people at METR themselves give.
I suppose it’s worth asking: what evidence, scientific or otherwise, would convince you that this all has been a mistake? That the belief in a significant probability of near-term AGI actually wasn’t well-supported after all?
I can give many possible answers to the opposite question, such as (weighted out of 5 in terms of how important they would be to me deciding that I was wrong):
"Any sort of significant credible evidence of a major increase in AI capabilities, such as LLMs being able to autonomously and independently come up with new correct ideas in science, technology, engineering, medicine, philosophy, economics, psychology, etc"
Just in the spirit of pinning people to concrete claims: would you count progress on Frontier Math 4, like, say, models hitting 40%*, as being evidence that this is not so far off for mathematics specifically? (To be clear, I think it is very easy to imagine models that are doing genuinely significant research maths but still can't reliably be a personal assistant, so I am not saying this is strong evidence of near-term AGI or anything like that.) Frontier Math Tier 4 questions allegedly require some degree of "real" mathematical creativity and were designed by actual research mathematicians, including in some cases Terry Tao (EDIT: that is, he supplied some Frontier Math questions; I'm not sure if any were Tier 4), so we're not talking cranks here. Epoch claim some of the problems can take experts weeks. If you wouldn't count this as evidence that genuine AI contributions to research mathematics might not be more than 6-7 years off, what, if anything, would you count as evidence of that? If you don't like Frontier Math Tier 4 as an early warning sign, is that because:
1) You think it's not really true that the problems require real creativity, and you don't think "uncreative" ways of solving them will ever get you to being able to do actual research mathematics that could get in good journals.
2) You just don't trust models not to be trained on the test set, because there was a scandal about OpenAI having access to the answers. (Though as I've said, the current state of the art is a Google model.)
3) 40% is too low, something like 90% would be needed for a real early warning sign.
4) In principle, this would be a good early warning sign if, for all we knew, RL scaling could continue for many more orders of magnitude, but since we know it can't continue for more than a few, it isn't, because by the time you're hitting a high level on Frontier Math 4, you're hitting the limits of RL scaling and can't improve further.
Of course, maybe you think the metric is fine, but you just expect progress to stall well before scores are high enough to be an early warning sign of real mathematical creativity, because of limits to RL-scaling?
*Current best is some version of Gemini at 18%.
I wonder if you noticed that you changed the question? Did you not notice or did you change the question deliberately?
What I brought up as a potential form of important evidence for near-term AGI was:
You turned the question into:
Now, rather than asking me about the evidence I use to forecast near-term AGI, you’re asking me to forecast the arrival of the evidence I would use for forecasting near-term AGI? Why?
My thought process didn't go beyond "Yarrow seems committed to a very low chance of AI having real, creative research insights in the next few years; here is something that puts some pressure on that". Obviously I agree that when AGI will arrive is a different question from when models will have real insights in research mathematics. Nonetheless, I got the feeling (maybe incorrectly) that your strength of conviction that AGI is far off is partly based on things like "models in the current paradigm can't have 'real insight'", so it seemed relevant, even though "real insight in maths is probably coming soon, but AGI likely over 20 years away" is perfectly coherent, and indeed close to my own view.
Anyway, why can't you just answer my question?
I have no idea when AI systems will be able to do math research and generate original, creative ideas autonomously, but it will certainly be very interesting if/when they do.
It seems like there’s not much of a connection between the FrontierMath benchmark and this, though. LLMs have been scoring well on question-and-answer benchmarks in multiple domains for years and haven’t produced any original, correct ideas yet, as far as I’m aware. So, why would this be different?
LLMs have been scoring above 100 on IQ tests for years and yet can’t do most of the things humans who score above 100 on IQ tests can do. If an LLM does well on math problems that are hard for mathematicians or math grad students or whatever, that doesn’t necessarily imply it will be able to do the other things, even within the domain of math, that mathematicians or math grad students do.
We have good evidence for this because LLMs as far back as GPT-4, nearly 3 years ago, have done well on a bunch of written tests. Despite there being probably over 1 billion regular users of LLMs and trillions of queries put to LLMs, there’s no indication, as far as I’m aware, of an LLM coming up with a novel, correct idea of any note in any academic or technical field. Is there a reason to think performance on the FrontierMath benchmark would be different from the trend we’ve already seen with other benchmarks over the last few years?
The FrontierMath problems may indeed require creativity from humans to solve them, but that doesn’t necessarily mean solving them is a sign of creativity from LLMs. By analogy, playing grandmaster-level chess may require creativity from humans, but not from computers.
This is related to an old idea in AI called Moravec’s paradox, which warns us not to assume what is hard for humans is hard for computers, or what is easy for humans is easy for computers.
I guess I feel like if being able to solve mathematical problems designed by research mathematicians to be similar to the kind of problems they solve in their actual work is not decent evidence that AIs are on track to be able to do original research in mathematics in less than, say, 8 years, then what would you EVER accept as empirical evidence that we are on track for that but not there yet?
Note that I am not saying this should push your overall confidence to over 50% or anything, just that it ought to move you up by a non-trivial amount relative to whatever your credence was before. I am certainly NOT saying that skill on Frontier Math 4 will inevitably transfer to real research mathematics, just that you should think there is a substantial risk that it will.
I am not persuaded by the analogy to IQ test scores, for the following reason. It is far from clear that the tasks LLMs can't do despite scoring 100 on IQ tests resemble IQ test items anything like as closely as the Frontier Math 4 tasks are (at least allegedly) designed to resemble real research questions in mathematics*, because the latter are deliberately designed for similarity, whereas IQ tests are just designed so that skill on them correlates with skill on intellectual tasks in general among humans. (I also think the inference toward "they will be able to DO research math" from progress on Frontier Math 4 is rather less shaky than "they will DO proper research math in the same way as humans". It's not clear to me what tasks actually require "real creativity", if that means a particular reasoning style rather than just the production of novel insights as an end product. I don't think you or anyone else knows this either.) Real math is also uniquely suited to question-and-answer benchmarks, I think, because things really are often posed as extremely well-defined problems with determinate answers, i.e. prove X. Proving things is not literally the only skill mathematicians have, but being able to prove the right stuff is enough to be making a real contribution. In my view that makes claims for construct validity here much more plausible than, say, inferring ChatGPT can be a lawyer if it passes the bar exam.
In general, your argument here seems like it could be deployed against literally any empirical evidence that AIs were approaching being able to do a task, short of them actually performing that task. You can always say "just because in humans, ability to do X is correlated with ability to do Y, doesn't mean the techniques the models are using to do X can do Y with a bit of improvement." And yes, that is always true: it doesn't *automatically* mean that. But if you allow this to mean that no success on any task ever significantly moves you at all about future real-world progress on intuitively similar but harder tasks, you are basically saying it is impossible to get empirical evidence that progress is coming before it has arrived, which is just pretty suspicious a priori. What you should do, in my view, is think carefully about the construct validity of the particular benchmark in question, and then, roughly, update your view based on how likely you think it is to be basically valid and what it would mean if it was. You should take into account the risk that success on Frontier Math 4 is giving real signal, not just the risk that it is meaningless.
My personal guess is that it is somewhat meaningful, and we will see the first real AI contributions to maths in 6-7 years, that is, a 60% chance by then of AI proofs important enough for credible mid-ranking journals. To be clear, I say "somewhat" because this is several years after I expect the benchmark itself to saturate. But I am not shocked if someone thinks "no, it is more likely to be meaningless". But I do think if you're going to make a strong version of the "it's meaningless" case, where you don't see the results as signal to any non-negligible degree, you need more than to just say "some other benchmarks in far less formal domains, apparently far less similar to the real-world tasks being measured, have low construct validity."
In your view, is it possible to design a benchmark that a) does not literally amount to "produce a novel important proof", but b) nonetheless improvements on the benchmark give decent evidence that we are moving towards models being able to do this? If it is possible, how would it differ from Frontier Math 4?
*I am prepared to change my mind on this if a bunch of mathematicians say "no, actually the questions don't look like they were optimized for this."
I am not breaking new ground by saying it would be far more interesting to see an AI system behave like a playful, curious toddler or a playful, curious cat than a mathematician. That would be a sign of fundamental, paradigm-shifting capabilities improvement and would make me think maybe AGI is coming soon.
I agree that IQ tests were designed for humans, not machines, and that’s a reason to think it’s a poor test for machines, but what about all the other tests that were designed for machines? GPT-4 scored quite high on a number of LLM benchmarks in March 2023. Has enough time passed that we can say LLM benchmark performance doesn’t meaningfully translate into real world capabilities? Or do we have to reserve judgment for some number of years still?
If your argument is that math as a domain is uniquely well-suited to the talents of LLMs, that could be true. I don’t know. Maybe LLMs will become an amazing AI tool for math, similar to AlphaFold for protein structure prediction. That would certainly be interesting, and would be exciting progress for AI.
I would say this argument is highly irreducibly uncertain and approaches the level of uncertainty of something like guessing whether the fundamental structure of physical reality matches the fundamental mathematical structure of string theory. I’m not sure it’s meaningful to assign probabilities to that.
It also doesn’t seem like it would be particularly consequential outside of mathematics, or outside of things that mathematical research directly affects. If benchmark performance in other domains doesn’t generalize to research, but benchmark performance in math does generalize to math research, well, then, that affects math research and only math research. Which is really interesting, but would be a breakthrough akin to AlphaFold — consequential for one domain and not others.
You said that my argument against accepting FrontierMath performance as evidence for AIs soon being able to perform original math research is overly general, such that a similar argument could be used against any evidence of progress. But what you said about that is overly general and similar reasoning could be used against any argument about not accepting a certain piece of evidence about current AI capabilities to support a certain conclusion about AI capabilities forecasting.
I suppose looking at the general contours of arguments from 30,000 feet in the air rather than their specifics and worrying “what if” is not particularly useful.
I guess I still just want to ask: if models hit 80% on Frontier Math by, like, June 2027, how much does that change your opinion on whether models will be capable of "genuine creativity" in at least one domain by 2033? I'm not asking for an exact figure, just a ballpark guess. If the answer is "hardly at all", is there anything short of a 100% clear example of a novel publishable research insight in some domain that would change your opinion on when "real creativity" will arrive?
What I just said: AI systems acting like a toddler or a cat would make me think AGI might be developed soon.
I’m not sure FrontierMath is any more meaningful than any other benchmark, including those on which LLMs have already gotten high scores. But I don’t know.
I asked about genuine research creativity, not AGI, but I don't think this conversation is going anywhere at this point. It seems obvious to me that "does stuff mathematicians say makes up the building blocks of real research" is meaningful evidence that the chance that models will do research-level maths in the near future is not ultra-low, given that capabilities do increase with time. I don't think this is analogous to IQ tests or the bar exam, and for other benchmarks, I would really need to see what you're claiming is the equivalent of the transfer from Frontier Math 4 to real math that was intuitive but failed.
What percentage probability would you assign to your ability to accurately forecast this particular question?
I'm not sure why you're interested in getting me to forecast this. I haven't ever made any forecasts about AI systems' ability to do math research. I haven't made any statements about AI systems' current math capabilities. I haven't said that evidence of AI systems' ability to do math research would affect how I think about AGI. So, what's the relevance? Does it have a deeper significance, or is it just a random tangent?
If there is a connection to the broader topic of AGI or AI capabilities, I already gave a bunch of examples of evidence I would consider to be relevant and that would change my mind. Math wasn't one of them. I would be happy to think of more examples as well.
I think a potentially good counterexample to your argument about FrontierMath → original math research is natural language processing → replacing human translators. Surely you would agree that LLMs have mastered the basic building blocks of translation? So, 2-3 years after GPT-4, why is demand for human translators still growing? One analysis claims that growth is counterfactually less than it would have been without the increase in the usage of machine translation, but demand is still growing.
I think this points to the difficulty in making these sorts of predictions. If back in 2015, someone had described to you the capabilities and benchmark performance of GPT-4 in 2023, as well as the rate of scaling of new models and progress on benchmarks, would you have thought that demand for human translators would continue to grow for at least the next 2-3 years?
I don't have any particular point other than what seems intuitively obvious in the realm of AI capabilities forecasting may in fact be false, and I am skeptical of hazy extrapolations.
My list is very similar to yours. I believe items 1, 2, 3, 4, and 5 have already been achieved to substantial degrees and we continue to see progress in the relevant areas on a quarterly basis. I don't know about the status of 6.
For clarity on item 1: AI company revenues in 2025 are on track to cover 2024 costs, so on a product basis, AI models are profitable; it's the cost of new models that pulls annual figures into the red. I think this will stop being true soon, but that's my speculation, not evidence, so I remain open to the possibility that scaling will continue to make progress towards AGI, potentially soon.
Do you stand by your accusation of bad faith?
Your accusation of bad faith seems to rest on your view that the restraints imposed by the laws of physics on space travel make an alien invasion or attack extremely improbable. Such an event may indeed be extremely improbable, but the laws of physics do not say so.
I have to imagine that you are referring to the speeds of spacecraft and the distances involved. The Milky Way Galaxy is 100,000 light-years in diameter, organized along a plane in a disc shape that is 1,000 light-years thick. NASA’s Parker Solar Probe has travelled at 0.064% of the speed of light. Let’s round that down to 0.05% of the speed of light for simplicity. At 0.05% of the speed of light, the Parker Solar Probe could travel between the two farthest points in the Milky Way Galaxy in 200 million years.
That means that if the maximum speed of spacecraft in the galaxy were limited to only the top speed of NASA’s fastest space probe today, an alien civilization that reached an advanced stage of science and technology — perhaps including things like AGI, advanced nanotechnology/atomically precise manufacturing, cheap nuclear fusion, interstellar spaceships, and so on — more than 200 million years ago would have had plenty of time to establish a presence in every star system of the Milky Way. At 1% of the speed of light, the travel time shrinks to 10 million years, and so on.
Designs for spacecraft that credible scientists and engineers thought Earth could actually build in the near future include a light-sail-based probe that would supposedly travel at 15-20% of the speed of light. Such a probe could traverse the diameter of the Milky Way in under 1 million years at top speed. Acceleration and deceleration complicate the picture somewhat, but the fundamental idea still holds.
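To make the crossing-time arithmetic above easy to check, here is a minimal sketch (assuming constant speed, using the round 100,000-light-year diameter figure from above, and ignoring acceleration and deceleration):

```python
# Rough galaxy-crossing times at a constant fraction of light speed,
# using the round 100,000-light-year diameter figure from the text and
# ignoring acceleration and deceleration, as in the rough estimates above.

GALAXY_DIAMETER_LY = 100_000

def crossing_time_years(fraction_of_c: float) -> float:
    # Distance in light-years divided by speed as a fraction of c gives years.
    return GALAXY_DIAMETER_LY / fraction_of_c

for label, frac in [("rounded Parker Solar Probe speed (0.05% c)", 0.0005),
                    ("1% c", 0.01),
                    ("light-sail concept (15% c)", 0.15)]:
    print(f"{label}: about {crossing_time_years(frac):,.0f} years")
# rounded Parker Solar Probe speed (0.05% c): about 200,000,000 years
# 1% c: about 10,000,000 years
# light-sail concept (15% c): about 666,667 years
```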
If there are alien civilizations in our galaxy, we don’t have any clear, compelling scientific reason to think they wouldn’t be many millions of years older than our civilization. The Earth formed 4.5 billion years ago, so if a habitable planet elsewhere in the galaxy formed just 10% sooner and put life on that planet on the same trajectory as on ours, the aliens would be 450 million years ahead of us. Plenty of time to reach everywhere in the galaxy.
The Fermi paradox has been considered and discussed by people working in physics, astronomy, rocket/spacecraft engineering, SETI, and related fields for decades. There is no consensus on the correct resolution to the paradox. Certainly, there is no consensus that the laws of physics resolve it.
So, if I’m understanding your reasoning correctly — that surely I must be behaving in a dishonest or deceitful way, i.e. engaging in bad faith, because obviously everyone knows the restraints imposed by the laws of physics on space travel make an alien attack on Earth extremely improbable — then your accusation of bad faith seems to rest on a mistake.
Thanks for giving me the opportunity to talk about this because the Fermi paradox is always so much fun to talk about.
It’s hard to know what "to substantial degrees" means. That sounds very subjective. Without the "to substantial degrees" caveat, it would be easy to prove that 1, 3, 4, and 5 have not been achieved, and fairly straightforward to make a strong case that 2 has not been achieved.
For example, it is simply a fact that Waymo vehicles have a human in the loop — Waymo openly says so — so Waymo has not achieved Level 4/5 autonomy without a human in the loop. Has Waymo achieved Level 4/5 autonomy without humans in the loop "to a substantial degree"? That seems subjective. I don’t know what "to a substantial degree" means to you, and it might mean something different to me, or to other people.
Humanoid robots have not achieved any profitable new applications in recent years, as far as I’m aware. Again, I don’t know what achieving this "to a substantial degree" might mean to you.
I would be curious to know what progress you think has been made recently on the fundamental research problems I mentioned, or what the closest examples are to LLMs engaging in the sort of creative intellectual act I described. I imagine the examples you have in mind are not something the majority of AI experts would agree fit the descriptions I gave.
Distinguish here between gold mining vs. selling picks and shovels. I’m talking about applications of LLMs and AI tools that are profitable for end users. Nvidia is extremely profitable because it sells GPUs to AI companies. In theory, in a hypothetical scenario, AI companies could become profitable by selling AI models as a service (e.g. API tokens, subscriptions) to businesses. But then would those business customers see any profit from the use of LLMs (or other AI tools)? That’s what I’m talking about. Nvidia is selling picks and shovels, and to some extent even the AI companies are selling picks and shovels. Where’s the gold?
The six-item list I gave was a list of some things that — each on their own but especially in combination — would go a long way toward convincing me that I’m wrong and my near-term AGI skepticism is a mistake. When you say your list is similar, I’m not quite sure what you mean. Do you mean that if those things didn’t happen, that would convince you that the probability or level of credence you assign to near-term AGI is way too high? I was trying to ask you what evidence would convince you that you’re wrong.
Thanks for the good post, Yarrow. I strongly upvoted it. I remain open to bets against short AI timelines, or what they supposedly imply, up to 10 k$. I mostly think about AI as normal technology.
Thanks, Vasco.
"AI as normal technology" is a catchy phrase, and could be a useful one, but I was so confused and surprised when I dug in deeper to what the "AI as a normal technology" view actually is, as described by the people who coined the term.
I think "normal technology" is a misnomer, because they seem to think some form of transformative AI or AGI will be created sometime over the next several decades, and in the meantime AI will have radical, disruptive economic effects.
They should come up with some other term for their view like "transformative AI slow takeoff" because "normal technology" just seems inaccurate.
Fair! What they mean is closer to "AI as a more normal technology than many predict". Somewhat relatedly, I liked the post Common Ground between AI 2027 & AI as Normal Technology.
1) Regardless of who is right about when AGI might be around (and bear in mind that we still have no proper definition for this), OP is right to call for more peer-reviewed scrutiny from people who are outsiders to both EA and AI.
This is just healthy, and regardless of whether this peer-reviewed scrutiny reaches the same or different conclusions, NOT doing it automatically provokes legitimate fears that the EA movement is biased because so many of its members have personal (and financial) stakes in AI.
See this point of view by Shazeda Ahmed (https://overthinkpodcast.com/episode-101-transcript). She's an information scholar who has looked at AI and its links with EA, and is one of the critics of the lack of a counter-narrative.
I, for one, will tend to be skeptical of conclusions reached by a small pool of similar (demographically, economically, but also in the way they approach an issue) people as I will feel like there was a missed opportunity for true debate and different perspectives.
I take the point that these are technical discussions, and that this makes it difficult to involve the general public in this debate, but not doing so creates the appearance (and often, more worryingly, the reality) of bias.
This can harm the EA movement as a whole (from my perspective it already does).
I'd love to see a more vocal and organised opposition that is empowered, respected, and funded to genuinely test assumptions.
2) Studying and devoting resources to preparing the world for AI technology doesn't seem like a bad idea given:
Low probability × massive stakes still justifies large resource allocation.
But, as OP seems to suggest, it becomes an issue when that focus is so prevalent that other, just as important or just as likely, issues are neglected because of it. It seems that EA's focus on rationality and 'counterfactuality' means that it should encourage people to work in fields that are truly neglected.
But can we really say that AI is still neglected given the massive outpourings of both private and public money into the sector? It is now very fashionable to work in AI, and a widespread belief is that doing so warrants a comfortable salary. Can we say the same thing about, say, the threat of nuclear annihilation, or biosecurity risk, or climate adaptation?
3) In response to the argument that 'even a false alarm would still produce valuable governance infrastructure': yes, but at what cost? I don't see much discussion on whether all those resources would be better spent elsewhere.
Working on AI isn't the same as doing EA work on AI to reduce X-risk. Most people working in AI are just trying to make the AI more capable and reliable. There probably is a case for saying that "more reliable" is actually EA X-risk work in disguise, even if unintentionally, but it's definitely not obvious this is true.
Are you presupposing that good practical reasoning involves (i) trying to picture the most-likely future, and then (ii) doing what would be best in that event (while ignoring other credible possibilities, no matter their higher stakes)?
It would be interesting to read a post where someone tries to explicitly argue for a general principle of ignoring credible risks in order to slightly improve most-probable outcomes. Seems like such a principle would be pretty disastrous if applied universally (e.g. to aviation safety, nuclear safety, and all kinds of insurance), but maybe there's more to be said? But it's a bit frustrating to read takes where people just seem to presuppose some such anti-precautionary principle in the background.
To be clear: I take the decision-relevant background question here to not be the binary question Is AGI imminent? but rather something more degreed, like Is there a sufficient chance of imminent AGI to warrant precautionary measures? And I don't see how the AI bubble popping would imply that answering 'Yes' to the latter was in any way unreasonable. (A bit like how you can't say an election forecaster did a bad job just because their 40% candidate won rather than the one they gave a 60% chance to. Sometimes seeing the actual outcome seems to make people worse at evaluating others' forecasts.)
Some supporters of AI Safety may overestimate the imminence of AGI. It's not clear to me how much of a problem that is? (Many people overestimate risks from climate change. That seems important to correct if it leads them to, e.g., anti-natalism, or to misallocate their resources. But if it just leads them to pollute less, then it doesn't seem so bad, and I'd be inclined to worry more about climate change denialism. Similarly, I think, for AI risk.) There are a lot more people who persist in dismissing AI risk in a way that strikes me as outrageously reckless and unreasonable, and so that seems by far the more important epistemic error to guard against?
That said, I'd like to see more people with conflicting views about AGI imminence arrange public bets on the topic. (Better calibration efforts are welcome. I'm just very dubious of the OP's apparent assumption that losing such a bet ought to trigger deep "soul-searching". It's just not that easy to resolve deep disagreements about what priors / epistemic practices are reasonable.)
No, of course not.
I have written about this at length before, on multiple occasions (e.g. here and here, to give just two examples). I don’t expect everyone who reads one of my posts for the first time to know all that context and background — why would they? — but, also, the amount of context and background I have to re-explain every time I make a new post is already high because if I don’t, people will just raise the obvious objections I didn’t already anticipate and respond to in the post.
But, in short: no.
I agree, but I didn’t say the AI bubble popping should settle the matter, only that I hoped it would motivate people to revisit the topic of near-term AGI with more open-mindedness and curiosity, and much less hostility toward people with dissenting opinions, given that there are already clear, strong objections — and some quite prominently made, as in the case of Toby Ord’s post on RL scaling — to the majority view of the EA community that seem to have mostly escaped serious consideration.
You don’t need an external economic event to see that the made-up graphs in "Situational Awareness" are ridiculous or that AI 2027 could not rationally convince anyone of anything who is not already bought-in to the idea of near-term AGI for other reasons not discussed in AI 2027. And so on. And if the EA community hasn’t noticed these glaring problems, what else hasn’t it noticed?
These are examples that anyone can (hopefully) easily understand with a few minutes of consideration. Anyone can click on one of the "Situational Awareness" graphs and very quickly see that the numbers and lines are just made up, or that the y-axis has an ill-defined unit of measurement (“effective compute”, which is relative to the tasks/problems compute is used for) or no unit of measurement (just “orders of magnitude”, but orders of magnitude of what?) and also no numbers. Plus other ridiculous features, such as claiming that GPT-4 is an AGI.
With AI 2027, it takes more like 10-20 minutes to see that the whole thing is just based on a few guys’ gut intuitions and nothing else. There are other glaring problems in EA discourse around AGI that take more time to explain, such as objections around benchmark construct validity or criterion validity. Even in cases where errors are clear, straightforward, objective, and relatively quick and simple to explain (see below), people often just ignore them when someone points them out. More complex or subtle errors will probably never be considered, even if they are consequential.
The EA community doesn’t have any analogue of peer review — or it just barely does — where people play the role of rigorously scrutinizing work to catch errors and make sure it meets a certain quality threshold. Some people in the community (probably a minority, but a vocal and aggressive minority) are disdainful of academic science in general and peer review in particular, and don’t think peer review or an analogue of it would actually be helpful. This makes things a little more difficult.
I recently caught two methodological errors in a survey question asked by the Forecasting Research Institute. Pointing them out was an absolutely thankless task and was deeply unpleasant. I got dismissed and downvoted, and if not for titotal’s intervention one of the errors probably never would have gotten fixed. This is very discouraging.
I’m empathetic to the fact that producing research or opinion writing and getting criticized to death also feels deeply unpleasant and thankless, and I’m not entirely sure on the nuances of how to make both sides of the coin feel rewarded rather than punished, but surely there must be a way. I’ve seen it work out well before (and it’s not like this is a new problem no one has dealt with before).
The FRI survey is one example, but one of many. In my observation, people in the EA community are not receptive to the sort of scrutiny that is commonplace in academic contexts. This could be anything from correcting someone on a misunderstanding of the definitions of technical terms used in machine learning or pointing out that Waymo vehicles still have a human in the loop (Waymo calls it "fleet response"). The community pats itself on the back for "loving criticism". I don’t think anybody really loves criticism — only rarely — and maybe the best we can hope for is to begrudgingly accept criticism. But that involves setting up a social and maybe even institutional process of criticism that currently doesn’t exist in the EA community.
When I say "not receptive", I don’t just mean that people hear the scrutiny and just disagree — that’s not inherently problematic, and could be what being receptive to scrutiny looks like — I mean that, for example, they downvote posts/comments and engage in personal insults or accusations (e.g. explicit accusations of "bad faith", of which there is one in the comments on this very post), or other hostile behaviour that discourages the scrutiny. Only my masochism allows me to continue posting and commenting on the EA Forum. I honestly don’t know if I have the stomach to do this long-term. It's probably a bad idea to try.
The Unjournal seems like it could be a really promising project in the area of scrutiny and sober second thought. I love the idea of commissioning outside experts to review EA research. I think for organizations with the money to pay for this, this should be the default.
I’ll say just a little bit more on the topic of the precautionary principle for now. I have a complex multi-part argument on this, which will take some explaining that I won’t try to do here. I have covered a lot of this in some previous posts and comments. The main three points I’d make in relation to the precautionary principle and AGI risk are:
1) Near-term AGI is highly unlikely, much less than a 0.05% chance in the next decade.
2) We don’t have enough knowledge of how AGI will be built to usefully prepare now.
3) As knowledge of how to build AGI is gained, investment into preparing for AGI becomes vastly more useful, such that the benefits of investing resources into preparation at higher levels of knowledge totally overwhelm the benefits of investing resources at lower levels of knowledge.
Is this something you're willing to bet on?
In principle, of course, but how? There are various practical obstacles such as:
If it’s a bet that takes a form where if AGI isn’t invented by January 1, 2036, people have to pay me a bunch of money (and vice versa), of course I’ll accept such bets gladly in large sums.
I would also be willing to take bets of that form for good intermediate proxies for AGI, which would take a bit of effort to figure out, but that seems doable. The harder part is figuring out how to actually structure the bet and ensure payment (if this is even legal in the first place).
From my perspective, it’s free money, and I’ll gladly take free money (at least from someone wealthy enough to have money to spare — I would feel bad taking it from someone who isn’t financially secure). But even though similar bets have been made before, people still don’t have good solutions to the practical obstacles.
I wouldn’t want to accept an arrangement that would be financially irrational (or illegal, or not legally enforceable), though, and that would amount to essentially burning money to prove a point. That would be silly; I don’t have that kind of money to burn.
Also, if I were on the low-probability end of a bet, I'd be more worried about the risk of measurement or adjudicator error where measuring the outcome isn't entirely clear cut. Maybe a ruleset could be devised that is so objective and so well captures whether AGI exists that this concern isn't applicable. But if there's an adjudication/measurement error risk of (say) 2 percent and the error is equally likely on either side, it's much more salient to someone betting on (say) under 1 percent odds.
It seems plausible that there could be significant adverse effects on AI Safety itself. There's been an increasing awareness of the importance of policy solutions, whose theory of impact requires support outside the AI Safety community. I think there's a risk that AI Safety is becoming linked in the minds of third parties with a belief in AGI imminence in a way that will seriously if not irrevocably damage the former's credibility in the event of a bubble / crash.
One might think that publicly embracing imminence is worth the risk, of course. For example, policymakers are less likely to endorse strong action for anything that is expected to have consequences many decades in the future. But being perceived as crying wolf if a bubble pops is likely to have some consequences.
I might be missing the point, but I'm not sure I see the parallels with FTX.
With FTX, EA orgs and the movement more generally relied on the huge amount of funding that was coming down the pipe from FTX Foundation and SBF. When all that money suddenly vanished, a lot of orgs and orgs-to-be were left in the lurch, and the whole thing caused a huge amount of reputational damage.
With the AI bubble popping... I guess some money that would have been donated by e.g. Anthropic early employees disappears? But it's not clear that that money has been 'earmarked' in the same way the FTX money was; it's much more speculative and I don't think there are orgs relying on receiving it.
OpenPhil presumably will continue to exist, although it might have less money to disburse if a lot of it is tied up in Meta stock (though I don't know that it is). Life will go on. If anything, slowing down AI timelines will probably be a good thing.
I guess I don't see how EA's future success is contingent on AI being a bubble or not. If it turns out to be a bubble, maybe that's good. If it turns out not to be a bubble, we sure as hell will have wanted to be on the vanguard of figuring out what a post-AGI world looks like and how to make it as good for humanity as possible.
This is directly answered in the post. Edit: Can you explain why you don’t find what is said about this in the post satisfactory?
You do address the FTX comparison (by pointing out that it won't make funding dry up), that's fair. My bad.
But I do think you're making an accusation of some epistemic impropriety that seems very different from FTX: getting FTX wrong (by not predicting its collapse) was a catastrophe, and I don't think it's the same for AI timelines. Am I missing the point?
The point of the FTX comparison is that, in the wake of the FTX collapse, many people in EA were eager to reflect on the collapse and try to see if there were any lessons for EA. In the wake of the AI bubble popping, people in EA could either choose to reflect in a similar way, or they could choose not to. The two situations are analogous insofar as they are both financial collapses and both could lead to soul-searching. They are disanalogous insofar as the AI bubble popping won’t affect EA funding and won’t associate EA in the public’s mind with financial crimes or a moral scandal.
It’s possible in the wake of the AI bubble popping, nobody in EA will try to learn anything. I fear that possibility. The comparisons I made to Ray Kurzweil and Elon Musk show that it is entirely possible to avoid learning anything, even when you ought to. So, EA could go multiple different ways with this, and I’m just saying what I hope will happen is the sort of reflection that happened post-FTX.
If the AI bubble popping wouldn’t convince you that EA’s focus on near-term AGI has been a mistake — or at least convince you to start seriously reflecting on whether it has been or not — what evidence would convince you?