In 2012, Holden Karnofsky[1] critiqued MIRI (then SI) by saying "SI appears to neglect the potentially important distinction between 'tool' and 'agent' AI." In particular, he claimed:
> Is a tool-AGI possible? I believe that it is, and furthermore that it ought to be our default picture of how AGI will work.
I understand this to be the first introduction of the "tool versus agent" ontology, and it is a helpful, (relatively) concrete prediction. Eliezer replied here, making the following summarized points (among others):
- Tool AI is nontrivial
- Tool AI is not obviously the way AGI should or will be developed
Gwern more directly replied by saying:
> AIs limited to pure computation (Tool AIs) supporting humans, will be less intelligent, efficient, and economically valuable than more autonomous reinforcement-learning AIs (Agent AIs) who act on their own and meta-learn, because all problems are reinforcement-learning problems.
Eleven years later, can we evaluate the accuracy of these predictions?
[1] Some Bayes points go to LW commenter shminux for saying that this Holden kid seems like he's going places.
My take is that both were fairly wrong.[1] AI is much more generally intelligent, and single systems are useful for many more things, than Holden and the tool AI camp would have predicted. But they are also extremely non-agentic.
(To me this is actually rather surprising. I would have expected agency to be necessary to get this much general capability.)
I'm tempted to call it a wash. But rereading Holden's writing in the linked post, he seems to be arguing fairly narrowly against AI necessarily being agentic, which seems to have predicted the current world (though note there's still plenty of time for AIs to get agentic, and I still roughly believe the arguments that they probably will).
This seems unsurprising, tbh. I think everyone now should be pretty uncertain about how AI will go in the future.
I guess it depends on your priors or something. It's agentic relative to a rock, but relative to an AI that can pass the LSAT, its agency is well below my expectations. It seems like ARC Evals had to coax and prod GPT-4 to get it to do things it "should" have been doing given even rudimentary levels of agency.