Yarrow Bouchard 🔸

1467 karma · Joined · Canada · strangecosmos.substack.com

Bio

Pronouns: she/her or they/them. 

Parody of Stewart Brand’s whole Earth button.

I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I’m trying to figure out where effective altruism can fit into my life these days and what it means to me.

I write on Substack, and used to write on Medium.

Sequences (2)

Criticism of specific accounts of imminent AGI
Skepticism about near-term AGI

Comments (716)

Topic contributions (13)

The blog post by the Australian AI safety organization says, “We apply METR’s time-horizon methodology…” How would this address the criticisms that have been raised against METR’s methodology?

At a glance, the FutureTech pre-print makes some interesting choices (e.g., task quality is only scored up to “above average”, and “above average” earns a perfect score) and acknowledges some of the limitations of its methodology (e.g., all tasks used for the experiment must contain all relevant information in the LLM prompt; is that realistic for most work tasks?). I wonder whether this pre-print will be submitted for publication in a journal. FutureTech seems to be one of those weird MIT hybrids between an academic research group and a management consultancy. I’m not sure they’ve ever published a peer-reviewed paper.

Someone could take the time to do a deep dive into the FutureTech pre-print and write a review, but I wonder whether that would be a good use of anyone’s time. Is there a reason to think this group publishes high-quality research that is worth getting into?

If someone thinks it’s worthwhile, and they also think the pre-print is unlikely to be submitted for peer review, one option would be to ask the EA organization called The Unjournal to commission a review by an external expert. 

Hi Kouadio. Just want to let you know that your comments don't have paragraph breaks between the paragraphs. Maybe you are copying and pasting from another app and the formatting is getting messed up? I'm just saying this because the text looks like it's all in one big block and that makes it harder to read. I want to make sure you get a fair shot at saying what you want to say, and fixing this formatting issue will make people more likely to read your comments.

The AI revenue growth we've seen so far is compatible with several different explanations, including an AI investment bubble and narrow AI applications that are economically useful but will not lead to AGI anytime soon. Professional investors and financial analysts are generally split between these two camps. Only a small minority believe in near-term AGI.

Some criticisms of the famous METR time horizons graph:

  • As you mentioned, some of the problems and limitations of the METR time horizons graph are sometimes (but not always) clearly disclosed by METR employees, including the CEO of METR. However, note the wide difference between the caveated description of what the graph says and the interpretation of the graph as a strong indicator of rapid, exponential improvement in general AI capabilities.
  • Gary Marcus, a cognitive scientist and AI researcher, and Ernest Davis, a computer scientist and AAAI fellow, co-authored a blog post on the METR graph that looks at how the graph was made and concludes that “attempting to use the graph to make predictions about the capacities of future AI is misguided”.
  • Nathan Witkin, a research writer at NYU Stern’s Tech and Society Lab, published a detailed breakdown of some of the problems with METR’s methodology. He concludes that it’s “impossible to draw meaningful conclusions from METR’s Long Tasks benchmark” and that the METR graph “contains far too many compounding errors to excuse”. Witkin calls out a specific tweet from METR, which presents the METR graph in the broad, uncaveated way that it’s often interpreted by believers in near-term AGI. He calls the tweet “an uncontroversial example of misleading science communication”. In a response to a comment on that post asking how much we should update our views based on the METR graph, Witkin responded, "to be very clear I am in fact claiming that the proper update is zero."

I'm just summarizing the conclusions here, not the substance of the critiques. I recommend that people go and read the critiques to see how the authors reach these conclusions.

I guess the point of the expert survey you cited was to explain that it does not support the idea of near-term AGI, right? I was confused because the title and introduction strongly state that the evidence has turned in favour of near-term AGI, but then you say that 2 out of the 4 pieces of evidence you cite do not support the idea of near-term AGI. I think you're just trying to do a general survey of the evidence, both the convincing and the unconvincing, right?

I agree that Bio Anchors is also not convincing evidence of anything, for the reasons explained here.

Something I changed my mind about after looking into both the AI Impacts survey and the Forecasting Research Institute's LEAP survey (as I wrote about here) is that survey results seem to be super sensitive to survey design, even to design choices that seem small to the designers and that they don't anticipate will have an impact. The effects are not small, either: in one case, the result was 750,000 times higher or lower depending on how the question was posed. I'm not sure these kinds of surveys really matter that much anyway, but I'm at least more interested in surveys where the designers are careful about the factors that can bias the results.

Overall, this post is a bit weird because the title and intro make a super strong claim — the tables have turned! — but then the body doesn't cash the cheque that the title and intro write. The new evidence that has turned the tables on AI skepticism is just AI revenue and the METR graph? So, if you agree that the METR graph has been debunked at this point, then it's just AI revenue. And what does AI revenue really show? Can narrow AI not make a lot of money? Are you really prepared to defend that claim? Have at it!

Maybe the claim is something really specific, which is that if you take AI revenue growth over the last 3 years and extrapolate the same rate of growth indefinitely, you end up with some ridiculously large number, and for that number to be true, we would need to have something like AGI. But you can't just take any trend and extrapolate it indefinitely. You need to have some explanation of what's causing the trend and whether it will continue or not. When you step on the accelerator of your car, extrapolating that trend forward indefinitely means you'll eventually exceed the speed of light. But we don't just extrapolate things forward, we think about cause and effect.
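To make this concrete, here's a toy calculation. The starting revenue and the growth rate are made-up assumptions for the sketch, not actual AI industry figures:

```python
# A minimal sketch of naive trend extrapolation. The starting revenue and
# the growth rate are made-up assumptions, not real AI industry data.

revenue = 10e9   # hypothetical: $10B/year in AI revenue today
growth = 3.0     # hypothetical: revenue has been tripling each year

for year in range(20):
    revenue *= growth  # extrapolate the same growth rate, year after year

# Prints ~3.49e+19: hundreds of thousands of times gross world product
# (~$1e14). The arithmetic is trivial; whether the growth continues is the
# actual question, and the chart itself can't answer it.
print(f"Extrapolated annual revenue after 20 years: ${revenue:.2e}")
```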

You could look at all sorts of industries (like SaaS) or companies (like Tesla) during a few years when growth is super fast, extrapolate that forever, and conclude that one day they will account for 100% of gross world product and take over the entire world economy. But we assume this won't happen because we understand what will prevent this from happening, and we also don't know about anything that would cause it to happen. So, will AI revenue increase until the Singularity happens? That depends on the technology. So, what will happen with the technology? Now we're back to square one! Looking at a chart of AI revenue doesn't settle anything. Will the chart go asymptotic into AI heaven? Or will it level out, or even crash? The answer to that question is not in the chart. It's in the world.

Extrapolation of past trends with no causal explanation of why the trend will continue is not empiricism! It is mysticism! It amounts to saying: we don't know what's happening or why or how, but, somehow, we know what will happen. This is not science. This is not financial analysis. This is not anything.

A facetious graph from The Economist extrapolating when the first 14-bladed razor will arrive:

My own facetious graph:

(Why do you expect this trend not to continue?)

This is a beautifully written comment, and succinct, and funny, and true.

I would give EA much more grace if its self-image was the same as what I presume the Big Garden Birdwatch's self-image is. Part of what gets me tilted out of my mind about the EA community is when people express this almost messianic Chosen Ones self-image — which ties into the pseudo-religious aspect you mentioned.

The high-impact, low-probability logic of existential risk is hypnotically alluring. If a 1 in 1 quintillion chance of reducing existential risk is equivalent to 100 human lives, what does that imply in terms of your moral responsibility when discussing existential risk? If you have things to say that could cast doubt on existential risk arguments, should you self-censor and hold your tongue? If you speak out and you're wrong, it could be the moral equivalent of killing 100 people. Would it be okay to lie? To exaggerate? Why not? Wouldn't you lie or exaggerate to save 100 lives? If the Nazis knocked at your door, wouldn't you lie to save Anne Frank in the attic?
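To spell out the expected-value arithmetic implicit in that hypothetical (the $10^{20}$ figure for lives at stake is the assumption needed to make the numbers come out, not a quantity anyone has measured):

$$\underbrace{10^{-18}}_{\text{1 in 1 quintillion}} \times \underbrace{10^{20}\ \text{lives at stake}}_{\text{assumed size of the future}} = 100\ \text{lives in expectation}$$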

I don't think many people are actually outright lying when it comes to existential risk. But I do think people are self-censoring when it comes to criticism, and I do think people are willing to make excuses for really low-quality products like AI 2027 or 80,000 Hours' video on it because anything that builds momentum for existential risk fear is plausibly extremely high in expected value.

Much could be said in response to this comment. Probably the most direct and succinct response is my post “Unsolved research problems on the road to AGI”.

Largely for the reasons explained in that post, I think AGI is much less than 0.01% likely in the next decade.

How much of a post are you comfortable for AI to write?

100% disagree

I will never let AI write a single sentence! I resent reading AI-generated writing passed off as written by a human, and I would never inflict this upon my readers.

I have found that the most common explanation for why people use AI for writing is a lack of self-confidence. I keep encouraging people to write in their own words and use their own voice, because all the flaws of unpolished human writing are vastly preferable to chatbot writing.

Thank you for your supportive comment. I think David Mathers is an exceptionally and commendably valuable contributor to the EA Forum in terms of engaging deeply with the substance of arguments around AI safety and AGI forecasting. David engages in discussions with a high level of reasoning transparency, which I deeply appreciate. It isn’t always clear to me why people who fall on the opposite side of debates around AI safety and AGI forecasting believe what they do, and talking to David has helped me understand this better. I would love to have more discussions about these topics with David, or with interlocutors like him. I feel as though there is still much work to be done in bringing the cruxes of these debates into sharp relief.

The EA Forum has a little-used “Dialogues” feature that I think has some potential. Anyone who would be interested in having a Dialogue on AGI forecasting and/or AGI safety should send me a private message.

On to the rest of your comment:

I think the current investments in AGI safety will end up being wasted. I think it’s a bit like paying philosophers in the 1920s to think about how to mitigate social media addiction, years before the first proper computer was built, and even before the concept of a Turing machine was formalized. There is simply too much unknown about how AGI might eventually be built.

Conversely, investments in narrow, prosaic “AI safety”, like making LLM chatbots less likely to give people dangerous medical advice, are modestly useful today but will have no applicability to AGI much later on. Other than having the name “AI” in common, running on computers, and probably using some sort of connectionist architecture, I don’t think today’s AI systems will have any meaningful resemblance to AGI, if it is eventually created.

I can’t remember where — I thought it was maybe in a comment on this post, but apparently not — but I seem to recall someone on the EA Forum saying that MIRI or Yudkowsky deserved credit for correctly predicting the sort of “alignment” failures that modern AI systems like LLMs would exhibit. (If anyone remembers the specific comment I’m thinking of, please let me know.) I want to set the record straight and explain why this is not true.

Reinforcement learning was originally developed in the late 1970s and the 1980s. Years before the founding of MIRI (originally called the Singularity Institute, and created with a different focus) and before Yudkowsky first wrote about “friendly AI”, RL researchers noticed the phenomenon of “reward hacking” or “specification gaming” (although these exact terms were not always used to describe it). One example is found in the 1998 paper “Learning to Drive a Bicycle using Reinforcement Learning and Shaping”. The authors created a bicycle riding simulation and tasked an RL agent with riding the bicycle toward a target. The RL agent found it could accumulate reward by riding in circles around its starting point, never reaching the target (page 6):

We agree with Mataric [Mataric, 1994] that these heterogeneous reinforcement functions have to be designed with great care. In our first experiments we rewarded the agent for driving towards the goal but did not punish it for driving away from it. Consequently the agent drove in circles with a radius of 20–50 meters around the starting point. Such behavior was actually rewarded by the reinforcement function…

One could cite many more examples like this.[1]
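For readers who want to see the failure mode concretely, here is a minimal sketch in Python. It is a toy model of the reward-shaping flaw described in the quote above, not the paper's actual code; the reward function and the oscillating policy are my own illustrative assumptions:

```python
# Toy model of the reward-shaping flaw from the 1998 bicycle paper:
# reward progress toward the goal, but don't penalize moving away from it.
# Illustrative sketch only, not the paper's actual setup.

def shaped_reward(prev_dist: float, new_dist: float) -> float:
    """Reward any decrease in distance to the goal; ignore any increase."""
    return max(0.0, prev_dist - new_dist)

# A "policy" that just oscillates: one step toward the goal, one step back.
dist = 100.0
total_reward = 0.0
for step in range(1000):
    new_dist = dist - 1.0 if step % 2 == 0 else dist + 1.0
    total_reward += shaped_reward(dist, new_dist)
    dist = new_dist

# Prints "500.0 100.0": reward accumulates without bound over time,
# while the agent never gets closer than 99.0 to the goal.
print(total_reward, dist)
```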

It is possible to mistake an awareness of well-known concepts in a field (such as RL or AI more generally) for prescience or insight. Readers of MIRI’s or Yudkowsky’s writing should be wary of this.

  1. ^

    A similar example to the one just given but using real robots made of Lego is found in the 2004 paper “Lego Mindstorms Robots as a Platform for Teaching Reinforcement Learning”. One robot learned to continually drive backwards and forwards along the same stretch of track to maximize reward (page 5):

    After some experimentation with the reinforcement signal, the reinforcement learning system was much more successful on the line-following task than on the walking task. 

    In the initial trials, the reinforcement signal used rewarded the robot with positive reinforcement for any action which led to the robot remaining on the track (measured by applying a threshold to the summed value of the light sensors). As the actions available did not provide an option for staying still, this was expected to lead to the robot moving forward along the path and eventually traversing the circuit. However the learning algorithm discovered that alternating turning left and right allowed the robot to reverse slowly in a straight line, and hence maximal reinforcement could be achieved by travelling along a straight section of line at the beginning of the track, and then reversing back along that same section of track.

There's an expert consensus that tobacco is harmful, and there is a well-documented history of tobacco companies engaging in shady tactics. There is also a well-documented history of government propaganda being misleading and deceptive, and if you asked anyone with relevant expertise — historians, political scientists, media experts, whoever — they would certainly tell you that government propaganda is not reliable.

But just lumping "AI accelerationist companies" in with those is not justified. "AI accelerationist" just means anyone who works on making AI systems more capable who doesn't agree with the AI alignment/AI safety community's peculiar worldview. In practice, that means you're saying most people with expertise in AI are compromised and not worth listening to, but you are willing to listen to this weird, random group of people, some of whom, like Yudkowsky, have no technical expertise in contemporary AI paradigms (i.e., deep learning and deep reinforcement learning). This seems like a recipe for disaster, like deciding that capitalist economists are all corrupt and that only Marxist philosophers are worth trusting.

A problem with motivated-reasoning arguments, when stretched to this extent, is that anyone can accuse anyone on the thinnest pretext. And rather than engaging with people's views and arguments in any serious, substantive way, it just turns into a lot of finger-pointing.

Yudkowsky's gotten paid millions of dollars to prophesy AI doom. Many people have argued that AI safety/AI alignment narratives benefit the AI companies and their investors. The argument goes like this: exaggerating the risks of AI exaggerates AI's capabilities. Exaggerating AI's capabilities makes the prospective financial value of AI look much higher than it really is. Therefore, talking about AI risk, or even AI doom, is good business.

I would add that exaggerating risk may be a particularly effective way to exaggerate AI's capabilities. People tend to be skeptical of anything that sounds like pie-in-the-sky hope or optimism. On the other hand, talking about risk sounds serious and intelligent. Notice what goes unsaid: many near-term AGI believers think there's a high chance of some unbelievably amazing utopia just on the horizon. How many times have you heard someone imagine that utopia? One? Zero? And how many times have you heard various AI doom or disempowerment stories? Why would no one ever bring up this amazing utopia they think might happen very soon?

Even if you're very pessimistic and think there's a 90% chance of AI doom, a 10% chance of utopia is still pretty damn interesting. And many people are much more optimistic, thinking there's around a 1-30% chance of doom, which implies a 70%+ chance of utopia. So, what gives? Where's the utopia talk? Even when people talk about the utopian elements of AGI futures, they emphasize the worrying parts: if intelligent machines produce effectively unlimited wealth, how will we organize the economy? What policies will we need to implement? How will people cope? We need to start worrying about this now! When I think about what would happen if I won the lottery, my mind does not go to worrying about the downsides.

I think the overwhelming majority of people who express views on this topic are true believers. I think they are sincere. I would only be willing to accuse someone of possibly doing something underhanded if, independently, they had a track record of deceptive behaviour. (Sam Altman has such a track record, and generally I don't believe anything he says anymore. I have no way of knowing what's sincere, what's a lie, and what's something he's convinced himself of because it suits him to believe it.) I think the specific accusation that AI safety/AI alignment is a deliberate, conscious lie cooked up to juice AI investment is silly. It's probably true, though, that people at AI companies have some counterintuitive incentive or bias toward talking up AI doom fears.

However, my general point is that just as it's silly to accuse AI safety/alignment people of being shills for AI companies, it also seems silly to me to say that AI companies (or "AI accelerationist" companies, which is effectively all major AI companies and almost all startups) are the equivalent of tobacco companies, and you shouldn't pay attention to what people at AI companies say about AI. Motivated reasoning accusations made on thin grounds can put you into a deluded bubble (e.g. becoming a Marxist) and I don't think AI is some clear-cut, exceptional case like tobacco or state propaganda where obviously you should ignore the message.
