
I was encouraged to post this here, but I don't yet have enough EA forum karma to crosspost directly!

Epistemic status: these are my own opinions on AI risk communication, based primarily on my own instincts on the subject and discussions with people less involved with rationality than myself. Communication is highly subjective and I have not rigorously A/B tested messaging. I am even less confident in the quality of my responses than in the correctness of my critique.

If they turn out to be true, these thoughts can probably be applied to all sorts of communication beyond AI risk.

Lots of work has gone into trying to explain AI risk to laypersons. Overall, I think it's been great, but there's a particular trap that I've seen people fall into a few times. I'd summarize it as simplifying and shortening the text of an argument without enough thought for the information content. It comes in three forms. One is forgetting to adapt concepts for someone at a far inferential distance; another is forgetting to filter for the important information; the third is rewording an argument so much that you fail to sound like a human being at all.

I'm going to critique three examples which I think typify these:

Failure to Adapt Concepts

I got this from the summaries of AI risk arguments written by Katja Grace and Nathan Young here. I'm making the assumption that these summaries are supposed to be accessible to laypersons, since most of them seem written that way. This one stands out as not having been optimized on the concept level. This argument was of below-average effectiveness when tested.

I expect most people's reaction to point 2 would be "I understand all those words individually, but not together". It's a huge dump of conceptual information all at once which successfully points to the concept in the mind of someone who already understands it, but is unlikely to introduce that concept to someone's mind.

Here's an attempt to do better:

  1. So far, humans have mostly developed technology by understanding the systems which the technology depends on.
  2. AI systems developed today are instead created by machine learning. This means that the computer learns to produce certain desired outputs, but humans do not tell the system how it should produce the outputs. We often have no idea how or why an AI behaves in the way that it does.
  3. Since we don't understand how or why an AI works a certain way, it could easily behave in unpredictable and unwanted ways.
  4. If the AI is powerful, then the consequences of unwanted behaviour could be catastrophic.

And here's Claude's attempt, just for fun:

  1. Up until now, humans have created new technologies by understanding how they work.
  2. The AI systems made in 2024 are different. Instead of being carefully built piece by piece, they're created by repeatedly tweaking random systems until they do what we want. This means the people who make these AIs don't fully understand how they work on the inside.
  3. When we use systems that we don't fully understand, we're more likely to run into unexpected problems or side effects.
  4. If these not-fully-understood AI systems become very powerful, any unexpected problems could potentially be really big and harmful.

I think it gets points 1 and 3 better than me, but 2 and 4 worse. Either way, I think we can improve upon the summary.

Failure to Filter Information

When you condense an argument down, you make it shorter. This is obvious. What is not always as obvious is that this means you have to throw out information to make the core point clearer. Sometimes the information that gets kept is distracting. Here's an example from a poster a friend of mine made for Pause AI:

[Poster: Narrow AI learns to play chess by playing chess games, but AGI invents a chess AI and uses it as a tool.]

When I showed this to my partner, they said "This is very confusing, it makes it look like an AGI is an AI which makes a chess AI". Making more AIs is part of what AGI could do, but it's not really the central difference between narrow AI and AGI. The core property of an AGI is being capable at lots of different tasks.

Let's try and do better, though this is difficult to explain:

[Image: examples of narrow AIs (a chess AI, a chemistry AI, and a coding AI), captioned "Narrow AI learns to do a specific task by being trained on that task, such as playing chess or writing computer code. Narrow AI has a limited scope, so the overall risks are limited." Alongside, an example of an AGI doing several tasks, captioned "AGI is trained on diverse data and learns to do many different tasks. It could plan and reason, even make more AIs. This means the risks from AGI are much larger than from narrow AI."]

This one is not my best work, especially on the artistic front. It's a difficult concept to communicate! But I think this fixes the specific issue of information filtering. Narrow AIs do a single, bounded task; AGI can do a broad range of tasks.

Failure to Sound Like a Human Being

In this case, the writing is so compressed and removed from the original (complicated) concept that it breaks down and needs to be rewritten from the ground up. Here's a quick example from the same page (sorry Katja and Nathan! You're just the easiest example arguments to find, I really really do love the work you're doing). This is from the "Second Species Argument" which was of middling effectiveness, though this is a secondary example and not the core argument.

This is just ... an odd set of sentences. We get both of the previous errors for free here too. "An orangutan uses a stick to control juice" is poor information filtering: why does it matter that an orangutan can use a tool? "Should orangutans have felt safe inventing humans" is an unnecessarily abstract question: why not just ask whether orangutans have benefited from the existence of humans or not?

But moreover, the whole thing is one of the strangest passages of text I've ever read! "An orangutan uses a stick to control juice, while humans ... control the orangutan" is a really abstract and uncommon use of the word "control" which makes no sense outside of deep rationalist circles, and also sounds like it was written by aliens. Here's my attempt to do better:

[Image: a chimpanzee (San Francisco Zoo & Gardens)]
Chimpanzees are physically stronger and more agile than humans, but because we're more intelligent, we're more powerful. We can destroy their habitats or put them in zoos. Are chimps better off because a more intelligent species than them exists?

For a start, I'd use a chimp instead of an orangutan, because they're a less weird animal and a closer relative to humans, which better makes our point. I then explain that we're dominant over chimps due to our intelligence, and give examples. Then instead of asking "should chimps have invented humans" I ask "Are chimps better off because a more intelligent species than them exists?" which doesn't introduce a weird hypothetical surrounding chimps inventing humans.

Summary

It's tempting to take the full, complicated knowledge structure you (i.e. a person in the 99.99th percentile of time spent thinking about a topic) want to express, and try and map it one-to-one onto a layperson's epistemology. Unfortunately, this isn't generally possible to do when your map of (this part of) the world has more moving parts than theirs. Often, you'll have to first convert your fine-grained models to coarse-grained ones, and filter out extraneous information before you can map the resulting simplified model onto their worldview.

[A diagram attempting to explain the above paragraph visually. No novel information here for screen-reader users.]
On the off chance that this diagram helps, I might as well put it in.

One trick I use is to imagine the response someone would give if I succeeded in explaining the concepts to them, and then asked them to summarize what they've learned back to me. I'm pretending to be my target audience as they pass an ideological Turing test of my own views: "What would they say that would convince me they understood me?" Mileage may vary.
