Exponential AI takeoff is a myth

Christoph Hartmann 🔸

Exponential AI takeoff is a myth

Christoph Hartmann 🔸

11 min readMay 31, 2023

Comments 11

Sorted by

New & upvoted

titotal

Hey, great post. I mostly agree with your points here, and agree that an intelligence explosion is incredibly unlikely, especially anytime soon.

I'm not too sure about the limits of algorithm point: My impression is that current AI architecture is incredibly inefficient at using data when compared to humans. So it seems like even if we hit the limit with current architecture, there's room to invent new algorithms that are better.

I'm interested in your expertise as a a computational neuroscientist, do you think there are any particular insights from that field that are applicable to these discussions?

Christoph Hartmann 🔸

Thanks for your thoughts! When writing this up I also felt that the algorithm one is the weakest one, so let me answer from two perspectives:

From the room to invent new algorithms: Convolutional neural networks have been around since the 80s, we've been using GPUs to run them since about 10 years. If there really would be huge potential left, I'd be a bit surprised that we didn't find it in the last 40 years already - we certainly had incentives because hardware was so slow and people had to optimize, but of course you never know. I tried to find a paper reviewing efficiency improvements of non-negative matrix factorization over time, I think that could be a fun guide, but couldn't find one.
From the brain perspective: Yes, it's puzzling that the brain can do all this on 12 watts power while OpenAI is using server farms that consume much much more than that. So somewhere there must be huge efficiency gains. Note that that's mostly on the training side - "evaluating" a network is pretty efficient as far as I know. For training, there could be different reasons:
- Transfer learning: Maybe the "computation of evolution" just "pre-programmed" our brain similar to how we use transfer learning. It's already pretty close to where we want it and we just need to fine tune. Transfer learning on neural networks is already pretty cheap today. One argument supporting this would be that many animals are perfectly functional from day 1 of their life without much learning. Of course not same level of intelligence, but still.
- Hardware: The brain doesn't run on silicone. We use a very very abstracted version of our brain and there is much more going on biologically. Some people argue that a lot of computation is already happening in the dendrites, maybe the morphology of neurons has effects on computation, maybe the specific nonlinearity applied by the neurons is more relevant than we think, ... . One way to try to adress this would be to build chips that are more similar ("neuromorphic") but I haven't seen much progress there
- Architecture: The brain isn't a CNN. This might be a good approximation for our sensory cortices but even there it's not the same. The brain is very recurrent, not feed-forward, and it can't send signals back through it's synapses and therefore can't implement backpropagation. Maybe we're just using the wrong architecture and if we find the right one it's going to go much faster. I did my PhD on something related to this and I gave up haha, but of course, I'm sure there are lots of things to be discovered here.

Thomas Larsen

Either way, both compute and algorithms, even if we make a magical breakthrough in quantum computing tomorrow, are in the end limited by data. DeepMind showed in 2022 (see also here) that more compute only makes sense if you have more data to feed it. So even if we get exponentially scaling compute and algorithms, that would only give us the current models faster, not better. So what are the limits of data?

AI scaling laws refer to a specific algorithm and so are not relevant for arguing against algorithmic progress. For example, humans are much more sample efficient than LLMs right now, and so are an existence proof for more sample efficient algorithms. I also am pretty sure that humans are far from the limits of intelligence -- neuron firing speeds are on the order of 1-100 Hz, while computers can run much faster than this. Moreover, the human brain has all sorts of bottlenecks like needing to fit through a mother's birth canal that an AI need not have, as well as all the biases that plague our reasoning.

Epoch estimates algorithmic improvements at .4 OOM / year currently, and I feel that it's hard to be confident either way about which direction this will go in the future. AI assisted AI research could dramatically increase this, but on the other hand, as you say, scaling could hit a wall.

I agree that I don't expect the exponential to hold forever, I expect the overall growth to look more like a sigmoid, as described here (though my best guess parameters to this model are different than the default ones). Where I disagree is that I expect the sigmoid to top out at far stronger than human level.

Christoph Hartmann 🔸

Thanks for this, Thomas! See my answer to titotal addressing the algorithm efficiency question in general. Note that if we would follow the hand-wavy "evolutional transfer learning" argument that would weaken the existence proof for sample-efficiency of the human brain. The brain isn't a "general-purpose Tabula Rasa". But I do agree with you that probably we'll find a better algorithm that doesn't scale this badly with data and can extract knowledge more efficiently.

However, I'd argue that as before, even if we find a much much more efficient algorithm, we are in the end limited by the growth of knowledge and the predictability of our world. Epoch estimates that we'll run out of high-quality text data next year, which I would argue is the most knowledge-dense data we have. Even if we find more efficient algorithms, once AI has learnt all this text, it'll have to start generating new knowledge itself, which is much more cumbersome thant "just" absorbing existing knowledge.

[anonymous]

I've been thinking about this specific idea:

Intuitively, I think it makes sense that data should be the limiting factor of AI growth. A human with an IQ of 150 growing up in the rainforest will be very good at identifying plants, but won’t all of a sudden discover quantum physics. Similarly, an AI trained on only images of trees, even with compute 100 times more than we have now, will not be able to make progress in quantum physics.

It seems to me that you're making the point that extreme out-of-distribution domains are unreachable by generalization (at least rapidly). Let's consider that humans actually went from only identifying plants to making progress in quantum physics. How did this happen?

Humans didn't do it all of a sudden. It was only possible in stepwise fashion spanning generations, and required building on past knowledge (the way to climb ten steps up the ladder is simply to climb one step at a time ten times over).
Human population increases meant that more people were working on learning new knowledge.
Humans had to (as you point out) gather new information (not in our rainforest training set) in order to learn new insights.
Humans often had to test their insights to gain practical knowledge (which you also point out with respect to theoretical vs experimental physics)

If we assume that generating high-quality synthetic data would not allow for new knowledge outside of the learned domain, you would necessarily have to gather new information that humans have not gathered yet to not hit the data ceiling. As long as humans are required to gather new information, it's reasonable to assume that sustainable exponential improvement is unlikely, since human information-gathering speed would not increase in tandem. Okay, let's remove the human bottleneck. In this case, an exponentially improving AI would have to find a way to gather information from the outside world with exponentially increasing speeds (as well as test insights/theories at those speeds). Can you think of any way this would be possible? Otherwise, I find it hard not to reach the same conclusion as you.

Christoph Hartmann 🔸

Thanks for taking the time to formalizing this a bit more. I think you're capturing my ideas quite well and indeed I can't think of ways how this would scale exponentially. Your point on "let's remove the human bottleneck" goes a bit in the direction of the last simulation paragraph where I suggest that you could parallelize knowledge acquisition. But as I argue there I think that's unrealistic to scale exponentially.

In general, I think I focused too much on the robotics examples when trying to illustrate that generating new knowledge takes time and is difficult but the same applies of course also to performing any kind of other experiment that an AI would have to do such as generating knowledge on human psychology by doing experiments with us, testing new training algorithms, performing experiments on quantum physics for chip research, etc.

SteveZ

Hi, thanks for writing this up. I agree the macro trends of hardware, software, and algorithms are unlikely to hold true indefinitely. That said, I mostly disagree with this line of thinking. More precisely I find it unconvincing because there just isn’t a lot of empirical evidence for or against these macro trends (e.g. natural limits to the growth of knowledge), so I don’t really understand how you can use it to rule out certain endpoints as possibilities. And when I see an industry exec make a statement about Moore’s Law I generally assume it is only to reassure investors that the company is on the right path this quarter rather than making a profound forward-looking statement about the future of computing. For example since that 2015 quote, Intel lost the mobile market, fell far behind on GPUs, and is presently losing the datacenter market.

There are a number of well-funded AI hardware startups right now, and a lot of money and potential improvements on hardware roadmaps including but not limited to: exotic materials, 3D stacking, high-bandwidth interconnects, new memory architectures, and dataflow architecture. On the AI side techniques like distillation and dropout seem to be effective at allowing much smaller models to perform nearly as well. Altogether I don’t know if this will be enough to keep Moore’s law (and whatever you’d call the superlinear trend of AI models) going for another few decades but I don’t think I’d bet against it, either.

Christoph Hartmann 🔸

Hey Steve, thanks for those thoughts! I think I'm not more qualified than the wikipedia community to argue for or against Moore's law, that's why I just quoted them. So can't give more thoughts on that unfortunately.

But even if Moore's law would continue forever, I think that the data argument would kick in. If we have infinite compute but limited information to learn from, that's still a limited model. Applying infinite compute to the MNIST dataset will give you a model that won't be much better than the latest Kaggle competitor on that dataset.

So then we end up again at the more hand-wavy arguments for limits to the growth of knowledge and predictability of our world in general. Would be curious where I'm losing you there.

Marcel2

I think there is plenty of room for debate about what the curve of AI progress/capabilities will look like, and I mostly skimmed the article in about ~5 minutes, but I don't think your post's content justified the title ("exponential AI takeoff is a myth"). "Exponential AI takeoff is currently unsupported" or "the common narrative(s) for exponential AI takeoff is based on flawed premises" are plausible conclusions from this post (even if I don't necessarily agree with them), but I think the original title would require far more compelling arguments to be justified.

(I won't get too deep into this, but I think it's plausible that there is significant "methodological overhang": humans might just struggle to make progress in some fields of research—especially softer sciences and theory-heavy sciences—because principal-agent problems in research plague the accumulation of reliable knowledge through non-experimental methods.)

Christoph Hartmann 🔸

Hi Harrison, thanks for stating what I guess a few people are thinking - it's a bit of a clickbait title. I do think though that the non-exponential growth is much more likely than exponential growth just becuase exponential takeoff would require no constraints on growth while it's enough if one constraint kicks in (maybe even one I didn't consider here) to stop exponential growth.

I'd be curious on the methodological overhang though. Are you aware of any posts / articles discussing this further?

Marcel2

I haven't looked very hard but the short answer is no, I'm not aware of any posts/articles that specifically address the idea of "methodological overhang" (a phrase I hastily made up and in hindsight realize may not be totally logical) as it relates to AI capabilities.

That being said, I have written about the possibility that our current methods of argumentation and communication could be really suboptimal, here: https://georgetownsecuritystudiesreview.org/2022/11/30/complexity-demands-adaptation-two-proposals-for-facilitating-better-debate-in-international-relations-and-conflict-research/

Comments

More from the author

Salary Sacrifices Are Donations. Let's Treat Them as Such.

Christoph Hartmann 🔸·2mo ago·5m read

144

You should probably track your time (and it just got easier)

Christoph Hartmann 🔸·9mo ago·2m read

Launching Euzoia: Beeminder for Effective Charities

Christoph Hartmann 🔸, Cameron.K·4mo ago·1m read

Curated and popular this week

Hard-to-reverse decisions destroy option value

Stefan_Schubert·9y ago·Curated 3d ago·14m read

This post is co-authored with Ben Garfinkel. It is cross-posted from the CEA blog. A PDF version can be found here. Summary: Some strategic decisions available to the effective altruism m...

Introducing Impact List: a ranking of philanthropists by expected lives saved

Elliot Olds·4d ago·6m read

TL;DR: I'm releasing a website that ranks philanthropists according to EA principles and research, and allows users to re-rank the list using their own assumptions. I'd like feedback and help making it better. I'd especially like ideas for how to make the results more trustworthy. Funding may be available. Crossposted to LessWrong. ...

Marginal Victories: career advising and opportunities for U.S. democracy preservation & political work

Annika Burman 🔸·6d ago·2m read

TL;DR: Marginal Victories is a new initiative to provide 1:1 career advising, opportunities, and resources for people exploring high-leverage U.S. democracy preservation and political work. Built by impact-oriented people doing pro-democracy work, the early MVP is now up at marginalvictories.org. Fill out the 10-minute form now to receive these resources as they become available over the next few...

Recent opportunities to take action

Job: Executive Director of CEEALAR (EA Hotel)

CEEALAR·10h ago·3m read

Amsterdam Insect Protest

Bentham's Bulldog·1d ago·3m read

Marginal Victories: career advising and opportunities for U.S. democracy preservation & political work

Annika Burman 🔸·6d ago·2m read

Christoph Hartmann 🔸

Thanks for your thoughts! When writing this up I also felt that the algorithm one is the weakest one, so let me answer from two perspectives:

From the room to invent new algorithms: Convolutional neural networks have been around since the 80s, we've been using GPUs to run them since about 10 years. If there really would be huge potential left, I'd be a bit surprised that we didn't find it in the last 40 years already - we certainly had incentives because hardware was so slow and people had to optimize, but of course you never know. I tried to find a paper reviewing efficiency improvements of non-negative matrix factorization over time, I think that could be a fun guide, but couldn't find one.
From the brain perspective: Yes, it's puzzling that the brain can do all this on 12 watts power while OpenAI is using server farms that consume much much more than that. So somewhere there must be huge efficiency gains. Note that that's mostly on the training side - "evaluating" a network is pretty efficient as far as I know. For training, there could be different reasons:
- Transfer learning: Maybe the "computation of evolution" just "pre-programmed" our brain similar to how we use transfer learning. It's already pretty close to where we want it and we just need to fine tune. Transfer learning on neural networks is already pretty cheap today. One argument supporting this would be that many animals are perfectly functional from day 1 of their life without much learning. Of course not same level of intelligence, but still.
- Hardware: The brain doesn't run on silicone. We use a very very abstracted version of our brain and there is much more going on biologically. Some people argue that a lot of computation is already happening in the dendrites, maybe the morphology of neurons has effects on computation, maybe the specific nonlinearity applied by the neurons is more relevant than we think, ... . One way to try to adress this would be to build chips that are more similar ("neuromorphic") but I haven't seen much progress there
- Architecture: The brain isn't a CNN. This might be a good approximation for our sensory cortices but even there it's not the same. The brain is very recurrent, not feed-forward, and it can't send signals back through it's synapses and therefore can't implement backpropagation. Maybe we're just using the wrong architecture and if we find the right one it's going to go much faster. I did my PhD on something related to this and I gave up haha, but of course, I'm sure there are lots of things to be discovered here.

Exponential AI takeoff is a myth

Exponential AI takeoff is a myth

TL;DR

Disclaimer

Introduction

Nothing grows exponentially indefinitely

We’re reaching the limits of Moore’s law

We will probably reach the limits of algorithms

We’re reaching the limits of training data

There are natural limits to the growth of knowledge

We can’t simulate knowledge acquisition

There are natural limits to the predictability of the world

AI will be very useful and maybe even smarter than us, but it won’t overpower us overnight