Thanks for your answer.
other smart people disagree
I'm generally against this sort of appeal to authority. While I'm open to hearing the arguments of smart people, we should evaluate those arguments themselves and not the people giving them. So far, I've heard no argument that would change my opinion on this matter.
You seem to make a similar argument in your other comment:
[...] But when I ask myself what evidence I have for "there are not >20 similar sized jumps before AGI" I come up short. I don't necessarily think the burden of proof here is actually on people arguing that the chance of AGI in the next decade is non-negligible though: it's a goal of some serious people within the relevant science [...]
Again, I think just because there are serious people with this goal, that doesn't mean it is a justified belief. As you say yourself, you can't find evidence for your view. Extraordinary claims require extraordinary evidence, and the burden of proof lies with the person making the claim. That some serious/smart people believe in it is not enough evidence.
1%
I want to stress that even if we gave AGI/ASI a 1% probability in the next decade, my other point still stands: AI safety work is neither tractable nor neglected, and it is thus not a good intervention for people in EA to focus on.
It is true that Lovelace and Menabrea should have assumed a credible chance of rapid progress. Who knows, maybe if they had had the right resources and people, we could have had computers much earlier than we ultimately did.
But when talking about ASI, we are not just talking about rapid progress, we are talking about the most extreme progress imaginable. Extraordinary claims require extraordinary evidence, and so forth. We do not know what breakthroughs ASI requires, nor do we know how far we are from it.
It all comes down to the question of whether the current tech is relevant for ASI or not. In my estimation, it is not – something else entirely is required. The probability that we are discovering that something else just now is low.
While it might feel to you that AI progress has been rapid in the past decade, most of the innovations behind it, such as neural networks, gradient descent, backpropagation, and the concept of language models, are very old. The only major innovation of the past decade is the Transformer architecture from 2017; almost everything else has been incremental progress and scaling to larger models and datasets. Thus, the pace of AI architecture development is very slow, and the probability that a groundbreaking new AGI architecture will surface is low.
ASI is the ultimate form of AI and, in some sense, of computer science as a whole. Claiming that we will soon reach it just because we've only just got started in computer science seems premature, akin to claiming that physics will soon be solved just because we've made so much progress recently. Science (and AI in particular) is often compared to an infinite ladder: you can take as many steps as you like, and there will still be infinite steps ahead. I don't believe there are literally infinite steps to ASI, but assuming there must be only a few steps ahead just because there are a lot of steps behind is a fallacy.
I was recently reading Ada Lovelace's "Translator's Notes" from 1843, and came across this timeless quote (emphasis original):
It is desirable to guard against the possibility of exaggerated ideas that might arise as to the powers of the Analytical Engine. In considering any new subject, there is frequently a tendency, first, to overrate what we find to be already interesting or remarkable; and, secondly, by a sort of natural reaction, to undervalue the true state of the case, when we do discover that our notions have surpassed those that were really tenable.
This is a comment on the text by Luigi Menabrea that she was translating, in which he hyped the idea that the "conceptions of intelligence" could be encoded into the instructions of the Analytical Engine[1]. Having a much better technical understanding of the machine than Menabrea, Lovelace was skeptical of his ideas and urged him to calm down.
The rest of their discussion is much more focused on concrete programs the machine could execute, but this short quote struck me as very reminiscent of our current discussion. There was (some level of) scientific discussion of artificial intelligence in the 1840s, and their talking points seem so similar to ours, with some hyping and others being skeptical!
From the perspective of Lovelace and Menabrea, computer science was progressing incredibly fast. Babbage's Analytical Engine was a design for a working computer that was much better than earlier plans such as the Difference Engine. Designing complex programs became possible. I can feel their excitement while reading their texts. But even despite this, it took a hundred years until ENIAC, the first general-purpose digital computer, was built in 1945. The fact that a field progresses fast in its early days does not mean much when predicting its future progress.
The quote she was commenting: "Considered under the most general point of view, the essential object of the machine being to calculate, according to the laws dictated to it, the values of numerical coefficients which it is then to distribute appropriately on the columns which represent the variables, it follows that the interpretation of formulae and of results is beyond its province, unless indeed this very interpretation be itself susceptible to expression by means of the symbols which the machine employs. Thus, although it is not itself the being that reflects, it may yet be considered as the being which executes the conceptions of intelligence."
Since you ask for the viewpoint of those who disagree, here is a summary of my objections to your argument. It consists of two parts: first my objection to your probability of AI risk, and then my objection to your conclusion.
- It’s just a matter of time until humanity develops artificial superintelligence (ASI). There’s no in-principle barrier to such technology, nor should we by default expect sociopolitical barriers to automatically prevent the innovation.
- Indeed, we can’t even be confident that it’s more than a decade away.
- Reasonable uncertainty should allow at least a 1% chance that it occurs within 5 years (let alone 10).
A reasonable prior is that we will not develop ASI in the near future (out of all possible decades, each single decade has a very small probability of ASI being developed in it, well below 1%). To overcome this prior, we would need evidence. However, there is little to no evidence suggesting that any AGI/ASI technology is possible in the near future.
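To make the prior concrete with numbers of my own choosing (purely illustrative, not anything the argument depends on): if we assume ASI arrives in exactly one of the next N decades and spread the probability evenly across them, each particular decade gets

```latex
P(\text{ASI in a given decade}) = \frac{1}{N},
\qquad \text{e.g. } N = 200 \Rightarrow 0.5\%, \quad N = 1000 \Rightarrow 0.1\% .
```

Any flat prior over a long enough horizon thus lands well below 1% per decade; the question is how far the available evidence can shift it.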
It is clear that our current LLM tech is not sufficient for AGI, as it lacks several properties that an AGI would require, such as learning-planning[1]. Since current progress is not heading towards AGI, it does not count as good evidence for AGI technology surfacing in the near future.
- We should not neglect credible near-term risks of human disempowerment or even extinction. Such risks warrant urgent further investigation and investment in precautionary measures.
- If there’s even a 1% chance that, within a decade, we’ll develop technology that we can’t be confident humanity would survive—that easily qualifies as a “credible near-term risk” for purposes of applying this principle.
I'm a firm believer in the neglectedness, tractability, and importance (ITN) framework whenever deciding on possible interventions. Therefore, if the question is whether we should neglect a risk, the first thing to ask is whether others neglect it. In the case of AI risk, the answer is, in my opinion, no: AI risk is not neglected. It is, in fact, taken very seriously by major AI companies, numerous other organizations, and even some governments. AI is researched in almost every university on the planet, and massive funds go into AI safety research. So I believe AI risk fails the neglectedness criterion.
But even more crucially, I think it also fails tractability. Because AGI technology does not exist, we cannot research it. Most so-called "AI safety research" focuses on unimportant sidetracks that have no measurable effect on AI risk. Similarly, it is very difficult to establish any governmental policy to limit AI development, as we do not even know what kind of technology we would need to regulate, aside from a blanket ban on AI research. Most of our politicians would correctly deem such a ban an overreaching and harmful policy, since current AI tech is harmless from the X-risk viewpoint (and there would be no way out of the ban, since we cannot research the safety of non-existent tech).
I do not believe AI risk is important, as there is no good reason to believe we will develop ASI in the near future. But even if we believed so, it fails the other two criteria of the ITN framework and thus would not be a good target for interventions.
Learning-planning is what I call the ability to assess one's own abilities and efficiently learn missing abilities in a targeted way. Currently, machine learning algorithms are extremely inefficient, and models lack introspection capabilities required to assess missing abilities.
AGI is a pretty meaningless word, as people define it so differently (if they bother to define it at all). I think people should more precisely describe what they mean when they use it.
In your case, since automated AI research is what you care about, it would make the most sense to forecast that directly (or some indicator, assuming it is a good one). For automated research to be useful, it should produce significant and quantifiable breakthroughs. How exactly this should be defined is up for debate and would require a lot of work and careful thought, which sadly isn't given to the average Metaculus question.
To give an example of how difficult it is to define such a question properly, look at this Metaculus forecast concerning AI systems that can design other AI systems. It has the following condition:
This question will resolve on the date when an AI system exists that could (if it chose to!) successfully comply with the request "build me a general-purpose programming system that can write from scratch a deep-learning system capable of transcribing human speech."
In the comment section, there are people arguing that this condition is already met. It is in fact not very difficult to train such an AI system (it just requires a lot of compute). You can pull top ASR datasets from Huggingface, use a standard training script of fewer than 100 lines for a standard neural architecture, and you have your deep-learning system capable of transcribing human speech, completely "from scratch". Any modern coding LLM can write this program for you.
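To illustrate how low the bar is under that literal reading, here is a rough sketch of the kind of standard training script I mean. This is my own illustration: it uses torchaudio's built-in LibriSpeech loader rather than Huggingface for brevity, and the resulting model would be a toy, not a competitive transcriber.

```python
# Sketch: a "from scratch" speech-to-text system with an entirely standard
# architecture (mel spectrogram -> BiLSTM -> CTC). Assumes torch/torchaudio
# are installed and that downloading LibriSpeech is acceptable.
import torch
import torch.nn as nn
import torchaudio

chars = " 'abcdefghijklmnopqrstuvwxyz"                 # index 0 is the CTC blank
char_to_idx = {c: i + 1 for i, c in enumerate(chars)}
mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=80)

class TinyASR(nn.Module):
    """Deliberately ordinary: per-frame character predictions over mel features."""
    def __init__(self, n_mels=80, hidden=256, n_classes=len(chars) + 1):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, feats):                # feats: (batch, time, n_mels)
        h, _ = self.lstm(feats)
        return self.out(h)                   # (batch, time, n_classes)

def train(epochs=1):
    data = torchaudio.datasets.LIBRISPEECH(".", url="train-clean-100", download=True)
    model, ctc = TinyASR(), nn.CTCLoss(blank=0, zero_infinity=True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(epochs):
        for waveform, sr, transcript, *_ in data:        # batch size 1 for brevity
            feats = mel(waveform).squeeze(0).transpose(0, 1).unsqueeze(0)  # (1, T, 80)
            target = torch.tensor([[char_to_idx[c] for c in transcript.lower()
                                    if c in char_to_idx]])
            log_probs = model(feats).log_softmax(-1).transpose(0, 1)       # (T, 1, C)
            loss = ctc(log_probs, target,
                       input_lengths=torch.tensor([log_probs.size(0)]),
                       target_lengths=torch.tensor([target.size(1)]))
            opt.zero_grad()
            loss.backward()
            opt.step()

if __name__ == "__main__":
    train()
```

Nothing in that sketch is novel; every piece is decades-old textbook material, which is exactly the point.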
Adding the extra bootstrapping step of first training a coding model and then training the ASR model is no issue either: just pull standard pretraining and coding datasets and use a similar procedure. (Training coding LLMs is not practical for most people, since it requires an enormous amount of compute, but this is not relevant for the resolve condition.)
Of course, none of this is really useful, because while you can do what the Metaculus question asks, all this can do is train subpar models with standard architectures. So I think some people interpret the question differently. Maybe they take "from scratch" to mean that the neural architecture should be novel, designed anew by the AI. That would indeed be much more reasonable, since that kind of system could be used to do research on possible new architectures. This is supported by the following paragraph in the background section (emphasis original):
If an AI/ML system could become competent enough at programming that it could design a system (to some specification) that can itself design other systems, then it would presumably be sophisticated enough that it could also design upgrades or superior alternatives to itself, leading to recursive self-improvement that could dramatically increase the system's capability on a potentially short timescale.
The logic in this paragraph does not work. It assumes that a system that can design another system to some specification (which can itself design further systems...) can also design upgrades to itself, and that this would lead to recursive self-improvement. But I cannot see how the ability to design a system from a given specification (e.g., a known architecture) implies the ability to design a system without a specification.
Recursive self-improvement would also require that the newly designed system is better than the old system, but this is by default not the case. Indeed, it is very easy to produce randomized neural architectures that work but are simply bad. Any modern coding LLM can write you code for a hallucinated architecture. The ability to design a system is not the same as the ability to design a "good" system, which is itself a very difficult thing to define.
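As a toy illustration of that last point (my own sketch, not related to the Metaculus question itself), a few lines of code can "design" an endless stream of novel architectures that run without error and are almost all useless:

```python
# Sketch: randomly "designing" neural architectures that work but are bad.
# Assumes only that torch is installed.
import random
import torch
import torch.nn as nn

def random_mlp(in_dim=784, out_dim=10, max_depth=8):
    """Generate a random feed-forward architecture; it runs, but is rarely any good."""
    layers, dim = [], in_dim
    for _ in range(random.randint(1, max_depth)):
        width = random.choice([16, 32, 64, 128, 256, 512])
        layers += [nn.Linear(dim, width),
                   random.choice([nn.ReLU(), nn.Tanh(), nn.Sigmoid()])]
        dim = width
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

model = random_mlp()
print(model)                               # a perfectly valid "new" architecture
print(model(torch.randn(1, 784)).shape)    # and it runs: torch.Size([1, 10])
```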
The bottom line here is that this question is written with unstated assumptions. One of these assumptions seems to be that the system can design a system better than itself, but this is not included in the resolve condition. Since we can only guess what the original intention was, and there certainly seem to be multiple interpretations among the forecasters, this question as a whole doesn't really forecast anything. It would require a lot of work and effort to define these questions properly to avoid these issues.
I do see the quote. It seems there is something unclear about its meaning. A single neural net trained on multiple tasks is not a "cobbled together set of sub-systems". Neural nets are unitary systems in the sense that you cannot separate them into multiple subsystems, as opposed to ensemble systems that do have clear subsystems.
Modern LLMs are a good example of such unitary neural nets. It is possible to train (or fine-tune) an LLM for certain tasks, and the same weights would perform all those tasks without any subsystems. Due to the generalization property of neural network training, the LLM might also be good at tasks resembling the tasks in the training set. But this is quite limited: in fact, fine-tuning on one task probably makes the network worse at non-similar tasks.
Quite concretely speaking, it is imaginable that someone could take an existing LLM, GPT-5 for example, and fine-tune it to solve SAT math questions and Winogrande schemas and to play Montezuma's Revenge. The fine-tuned GPT-5 would be a unitary system: there wouldn't be a separate Montezuma subsystem that could be identified in the network; the same weights would handle all of those tasks. And the system could do all the things they mention ("explain its reasoning on an SAT problem or Winograd schema question, or verbally report its progress and identify objects during videogame play").
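For concreteness, here is a minimal sketch of what "one set of weights, several tasks" looks like in practice. Since I obviously cannot fine-tune GPT-5 here, the small open model gpt2 stands in, and the task data is mocked rather than being real SAT/Winogrande/game material:

```python
# Sketch: fine-tuning a single causal LM on a mixture of tasks.
# Assumes torch and transformers are installed; gpt2 is a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One mixed dataset covering three "tasks"; real training would use real examples.
mixed_examples = [
    "SAT: If 3x + 5 = 20, then x = 5",
    "WINOGRANDE: The trophy didn't fit in the suitcase because the trophy was too big.",
    "MONTEZUMA: state=room_1,key=false -> action=jump_left",
]

model.train()
for text in mixed_examples:                 # one pass, batch size 1, for brevity
    batch = tok(text, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training there is still exactly one set of weights; nothing in
# model.state_dict() is a separable "Montezuma subsystem".
print(sum(p.numel() for p in model.parameters()), "parameters, one network")
```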
My critique is based on how they have formulated their Metaculus question. Now, it is possible that some people interpret it differently than I do and assume things that are not explicitly said in the formulation. In that case, the whole forecast becomes unreliable, as we cannot be sure all forecasters share the same interpretation, in which case we couldn't use the forecast for argumentation at all.
The whole point of having the 4 disparate indicators is that they have to be done by a single unified system (not specifically trained for only those tasks)[1]. Such a system would implicitly be general enough to do many other tasks. Ditto with the Strong AGI question.
If you read the formulation carefully, you'll notice that it actually says nothing about the system not being trained specifically for those tasks. It only says that it must be a single unified system. It is entirely possible to train a single neural network on four separate tasks and have it perform well on all of them without it generalizing well to other categories of tasks.
Amusingly, they even exclude introspection from their definition, even though that is a property a real general intelligence should have. A system without some introspection couldn't know which tasks it cannot perform or identify flaws in its own operation, and thus couldn't really learn new capabilities in a targeted way. They quite explicitly say that its reasoning or reports on its progress can be hallucinated.
That is what both the Turing Test questions are all about! (Look at the success conditions in the fine print.)
Their conditions are really vague and leave a lot of practicalities out. There are a lot of footguns in conducting a Turing test. It is also unclear what passing a Turing test, even a rigorous one, would actually mean. It's not clear that it would imply the sort of dangerous consequences you talk about in your post.
Thanks for pointing this out. There was indeed a reasoning step missing from the text, namely: such an AGI would be able to automate further AI development, leading to rapid recursive self-improvement into ASI (Artificial Superintelligence). And it is ASI that would be lethally dangerous to humanity (/all biological life). I've amended the text.
Because the forecasts do not concern the kind of system that would be able to do recursive self-improvement (none of the indicators has anything to do with it), I don't see how this reasoning can work.
The conclusions of this post are based on a misunderstanding of the definition of AGI. The linked forecasts mainly contain poor indicators of AGI instead of a robust definition. None of these indicators actually implies that an "AGI" meeting them would be dangerous or catastrophic to humanity, and they do not merit the sensationalist tone of the text.
Indicators
The "Weak AGI" Metaculus question includes four indicators:
Aside from the Turing test, the other three criteria are simple narrow tasks that contain no element of learning[2], and there is nothing to indicate that a system passing them would be good at any other task. Since these tasks are not dangerous, a system able to perform them wouldn't be dangerous either, unless we bring in further assumptions, which the question does not mention. Since training a model on specific narrow tasks is much easier than creating a true AGI, it is to be expected that if someone creates such a system, it is likely not an AGI.
It is not only this "Weak AGI" question that is like this. In fact, the "Strong AGI" question from Metaculus is also simply a list of indicators, none of which implies any sort of generality. Aside from an "adversarial" Turing test, it contains the tasks of assembling a model car from instructions, solving programming challenges, and answering multiple-choice questions, none of which requires the model to be able to generalize outside of these tasks.
It would not surprise me if some AI lab specifically made a system that performs well on these indicators just to gain media attention for their supposed "AGI".
Turing Test
In addition to the narrow tasks, the other indicator used by these forecasts is the Turing test. While the Turing test is not a narrow task, it has a lot of other issues: the result is highly dependent on the people conducting the test (both the interrogator and the human interviewee) and on their knowledge of the system and of each other. While an ideal adversarial Turing test would be a very difficult task for an AI system, ensuring these ideal conditions is often not feasible. Therefore, I fully expect news that AI systems have passed some form of the adversarial test, but this should be taken only as limited evidence of a system's generality.
It puzzles me why they include a range of years. Since models are trained on vast datasets, it is very likely that they have seen most SAT exams from this range. It therefore makes no sense to use an old exam as a benchmark.
Montezuma's Revenge contains an element of "learning" the game in a restricted amount of time. However, the question fails to constrain this in any way: for example, training the model on very similar games and then fine-tuning it on less than 100 hours of Montezuma's Revenge would be enough to pass the criterion.
It seems to me that you are missing my point. I'm not trying to dismiss or debunk Aschenbrenner. My point is to call out that what he is doing is harmful to everyone, including those who believe AGI is imminent.
If you believe that AGI is coming soon, then shouldn't you try to convince other people of this? If so, shouldn't you be worried that people like Aschenbrenner ruin that by presenting themselves like conspiracy theorists?
We must engage at the object level. [...] We will have plenty of problems with the rest of the world doing its standard vibes-based thinking and policy-making. The EA community needs to do better.
Yes! That is why what Aschenbrenner is doing is so harmful: he is using an emotional or narrative argument instead of a real object-level argument. Like you say, we need to do better.
The author's object-level claim is that they don't think AGI is immanent. Why? How sure are you? How about we take some action or at least think about the possibility [...]
I have read the technical claims made by Aschenbrenner and many other AI optimists, and I'm not convinced. There is no evidence of any kind of general intelligence abilities surfacing in any of the current AI systems. People have been trying to achieve that for decades, and especially for the past couple of years, but there has been almost no progress on that front at all (in-context learning is one of the biggest advances I can think of, and it can hardly even be called learning). While I do think that some action can be taken, what Aschenbrenner suggests is, as I argue in my text, too much given our current evidence. Extraordinary claims require extraordinary evidence, as it is said.
Sorry for answering late.
My opinions are mostly the same. The last few years have seen mostly incremental improvements in AI capabilities, with no development in the areas I believe are crucial for AGI, such as considerably more efficient training algorithms and introspection. The current trend of using exponentially more compute without seeing a corresponding increase in capabilities (outside of a few exceptions such as coding[1]) demonstrates our lack of development: algorithmic development should enable us to achieve more with less compute, which is not what we are seeing[2].
There are many groups taking AI risk seriously. This reinforces my opinion that AI risk is not neglected. Since I also believe it is not tractable, it makes a poor choice for interventions. I believe this to be true regardless of what probability we assign to achieving AGI in the near future.
I might write a longer follow-up post later that goes through these in more detail.
Mathematics and coding are examples of skills that can be automatically validated to some extent, enabling us to train them without a training corpus. However, most skills are not like this, and we are not seeing improvements in those areas. Since one of my research areas is computational creativity, one example I can point to where progress is noticeably lacking is creative writing. Creativity indeed seems to have even taken a step backwards in the case of some models. This is due to the lack of suitable training material and the impossibility of automatically evaluating creative text. Human-created corpora are expensive, and we've run out of them. I believe strong creativity is one of the key capabilities required to achieve AGI, and we are not seeing progress there.
There are some algorithmic improvements increasing efficiency, but most of them are the kind of incremental development that gives small gains, not the breakthrough that would be required.