by [anonymous]

1

Many readers will be familiar with Peter Singer’s Drowning Child experiment:

On your way to work, you pass a small pond. On hot days, children sometimes play in the pond, which is only about knee-deep. The weather’s cool today, though, and the hour is early, so you are surprised to see a child splashing about in the pond.

As you get closer, you see that it is a very young child, just a toddler, who is flailing about, unable to stay upright or walk out of the pond. You look for the parents or babysitter, but there is no one else around. 

The child is unable to keep her head above the water for more than a few seconds at a time. If you don’t wade in and pull her out, she seems likely to drown. Wading in is easy and safe, but you will ruin the new shoes you bought only a few days ago, and get your suit wet and muddy. By the time you hand the child over to someone responsible for her, and change your clothes, you’ll be late for work. 

What should you do?

Olivia recently thought of a new version of the thought experiment.

As you get closer, you see that it is a very young child, just a toddler, who is flailing about, unable to stay upright or walk out of the pond. You look for the parents or babysitter, and they are around. There are also many lifeguards around. 

At first, you breathe a sigh of relief. But then, you notice something strange: no one is moving. The child continues to drown.

The qualified experts don’t even seem to be noticing. You try to grab their attention—you scream at the lifeguards. But they don’t move.

You’ve never pulled a child out of the water. You’re not even sure you could save the child. Surely, this should be the responsibility of someone more capable than you.

But the lifeguards remain still. 

What should you do?

2

People often ask us for advice as they consider next steps in their careers. Sometimes, we suggest ambitious things that go beyond someone’s default action space.

For example, we might ask Alice, a college junior, if she has considered trying to solve the alignment problem from first principles, founding a new organization in an area unrelated to her major, or doing community-building in a part of the world that she hasn’t visited.

Alice responds, "Wait, why would I do that? There must be people way more qualified than me—isn't this their responsibility?"

--- 

The issue is not that Alice has decided against any of these options. Any of these options might be an awful fit. The issue is that she doesn’t seriously consider them. The issue is she has not given herself permission to seriously evaluate them. She assumes that someone else more qualified than her is going to do it.

In EA, that’s often just not the case. It’s good to check, of course. Sometimes, there are competent teams who are taking care of things. 

But sometimes, the imaginary team with years of experience is not coming to save us. There is just us. We either build the refuges, or they don’t get built. We either figure out how to align the AI, or we don’t.

Sometimes, there is no plan that works. Sometimes, there is no standard job or internship that you can slot into to ensure a bright future.

3

Before jumping into the pond, you should be mindful of some caveats.

Sometimes, there are competent lifeguards. There are people more capable than you who will do the job. 

Sometimes, there is a good reason why the lifeguards aren’t intervening. Maybe the child isn’t actually drowning. Or maybe jumping into the pond would make the child drown faster. 

Sometimes, jumping in means other lifeguards are less likely to dive in. They might assume that you have the situation under control. They might not want to step on your toes. 

Try to notice these situations. Be mindful of alternative hypotheses for why no one seems to be jumping in. And be on the lookout for ways to mitigate the downsides.

But sometimes, the child is drowning, and the lifeguards aren’t going to save them. 

Either you will save them, or they will drown.

Comments (11)



This is a great story! Good motivational content.

But I do think, in general, a mindset of "only I can do this" is inaccurate and has costs. There are plenty of other people in the world, and other communities in the world, attempting to do good, and often succeeding. I think EAs have been a small fraction of the success in reducing global poverty over the last few decades, for example.

Here are a few plausible costs to me:

  • Knowing when and why others will do things significantly changes estimates of the marginal value of acting. For example, if you are starting a new project, it's reasonably likely that even if you have a completely new idea, other people will be in similar epistemic situations as you, and will soon stumble upon the same idea. So to estimate your counterfactual impact you might want to be estimating how much earlier something will occur because you made it occur, rather than purely the impact of the thing occurring. More generally, neglectedness is a key part of estimating your marginal impact - and estimating neglectedness relies heavily on an understanding of what others are focusing on, and usually at least a few people are doing things in a similar space to you.

  • Also, knowing when and why others will do things affects strategic considerations. The fact that in many places we now try to do good there are few non-EAs working there is a result of our attempts to find neglected areas. But - especially in the case of x-risk - we can expect others to begin to do good work in these areas as time progresses (see e.g. AI discussions around warning shots). The extent to which this is the case affects what is valuable to do now.

I really like these nuances. I think one of the problems with the drowning child parable / early EA thinking more generally is that it was (and still is, to a large extent) very focused on the actions of the individual.

It's definitely easier and more accurate to model individual behavior, but I think we (as a community) could do more to improve our models of group behavior even though it's more difficult and costly to do so. 

Minor, but 

Many readers will be familiar with Peter Singer’s Drowning Child experiment:

Should be 

Peter Singer's Drowning Child thought experiment.  

A "Drowning Child experiment" would be substantially more concerning.

This comment was co-written with Jake McKinnon:

The post seems obviously true when the lifeguards are the general experts and authorities, who just tend not to see or care about the drowning children at all. It's more ambiguous when the lifeguards are highly-regarded EAs.

  • It's super important to try to get EAs to be more agentic and skeptical that more established people "have things under control." In my model, the median EA is probably too deferential and should be nudged in the direction of "go save the children even though the lifeguards are ignoring them." People need to be building their own models (even if they start by copying someone else's model, which is better than copying their outputs!) so they can identify the cases where the lifeguards are messing up.
  • However, sometimes the lifeguards aren't saving the children because the water is full of alligators or something. Like, lots of the initial ideas that very early EAs have about how to save the child are in fact ignorant about the nature of the problem (a common one is a version of "let's just build the aligned AI first"). If people overcorrect to "the lifeguards aren't doing anything," then when the lifeguards tell them why their idea is dangerous, they'll ignore them.

The synthesis here is something like: it's very important that you understand why the lifeguards aren't saving the children. Sometimes it's because they're missing key information, not personally well-suited to the task, exhausted from saving other children, or making a prioritization/judgment error in a way that you have some reason to think your judgment is better. But sometimes it's the alligators! Most ideas for solving problems are bad, so your prior should be that if you have an idea, and it's not being tried, probably the idea is bad; if you have inside-view reasons to think that it's good, you should talk to the lifeguards to see if they've already considered this or think you will do harm.

Finally, it's worth noting that even when the lifeguards are competent and correctly prioritizing, sometimes the job is just too hard for them to succeed with their current capabilities. Lots of top EAs are already working on AI alignment in not-obviously-misguided ways, but it turns out that it's a very very very hard problem, and we need more great lifeguards! (This is not saying that you need to go to "lifeguard school," i.e. getting the standard credentials and experiences before you start actually helping, but probably the way to start helping the lifeguards involves learning what the lifeguards think by reading them or talking to them so you can better understand how to help.)

Ruby

Good comment!!


Most ideas for solving problems are bad, so your prior should be that if you have an idea, and it's not being tried, probably the idea is bad;


A key thing here is to be able to accurately judge whether the idea would be harmful if tried or not. "Prior is bad idea != EV is negative". If the idea is a random research direction, probably won't hurt anyone if you try it. On the other hand, for example, certain kinds of community coordination attempts deplete a common resource and interfere with other attempts, so the fact no one else is acting is a reason to hesitate.

Going to people who you think maybe ought to be acting and asking them why they're not doing a thing is probably a thing that should be encouraged and welcomed? I expect in most cases the answer will be "lack of time" rather than anything more substantial. 

In terms of thinking about why solutions haven't been attempted, I'll plug Inadequate Equilibria. Though it probably provides a better explanation for why problems in the broader world haven't been addressed. I don't think the EA world is yet in an equilibrium and so things don't get done because {it's genuinely a bad idea, it seems like the thing you shouldn't be unilateral on and no one has built consensus,  sheer lack of time}.

EA groups often get criticized by university students for "not doing anything." The answer usually given (which I think is mostly correct!) is that the vast majority of your impact will come from your career, and university is about gaining the skills you need to be able to do that. I usually say that EA will help you make an impact throughout your life, including after you leave college; the actions people usually think of as "doing things" in college (like volunteering), though they may be admirable,  don't.

Which is why I find it strange that the post doesn't mention the possibility of becoming a lifeguard.

In this story, the lifeguards aren't noticing. Maybe they're complacent. Maybe they don't care about their jobs very much. Maybe they just aren't very good at noticing.  Maybe they aren't actually lifeguards at all, and they just pretend to be lifeguards. Maybe the entire concept of "lifeguarding" is just a farce.

But if it's really just that they aren't noticing, and you are noticing, you should think about whether it really makes sense to jump into the water and start saving children. Yes, the children are drowning, but no, you aren't qualified to save them. You don't know how to swim that well, you don't know how to carry children out of the water, and you certainly don't know how to do CPR. If you really want to save lives, go get some lifeguard training and come back and save far more children.

But maybe the children are dying now, and this is the only time they're dying, so once you become a lifeguard it will be too late to do anything. Then go try saving children now!

Or maybe going to lifeguard school will destroy your ability to notice drowning children. In that case, maybe you should try to invent lifeguarding from scratch.

But unless all expertise is useless and worthless, which it might be in some cases, it's at least worth considering whether you should be focused on becoming a good lifeguard.

Thanks for this post! I always appreciate a pretty metaphor, and  I generally agree that junior EAs should be less deferential and more ambitious. Maybe most readers will in fact mostly take away the healthy lesson of "don't defer", which would be great! But I worry a bit about the urgent tone of "act now, it's all on you", which I think can lead in some unhealthy directions. 

To me, it felt like a missing mood within the piece was concern for the reader's well-being.  The concept of heroic responsibility is in some ways very beautiful and important to me, but I worry that it can very easily mess people up more than it causes them to do good. (Do heroic responsibility responsibly, kids.)  

When you feel like there are no lifeguards, and drowning children are everywhere, it's easy to exhaust yourself before you even get to the point of saving anyone at all. I've seen people burn themselves out over projects that, while promising, were really not organized with their sustainable well-being in mind.

If I were to write a version of this piece that reflected my approach to doing good, maybe I'd try to find a different metaphor that framed it more as an iterated game, to make it more natural to say something about conserving your strength / nurturing yourself / marathon-not-a-sprint. 

Some other comments I particularly resonated with: @levin's point about negative side effects due to unilateralist, uninformed action, and @VaidehiAgarwalla's point about implicitly reflecting an Eliezerish view of AI risk. I think the latter is part of what triggered my worry about this post potentially crushing people under the weight of responsibility.

A comment from a friend (I've paraphrased a bit): 

In this post two things stand out: 

  1. This advice seems to be particularly targeted at college students / undergraduates / people early in their careers (based on Section 2) and I expect many undergraduates might read this post.
  2. Your post links to 2 articles from Eliezer Yudkowsky's / MIRI's perspective of AI alignment, which is a (but importantly, not the only) perspective of alignment research that is particularly dire. Also, several people working on alignment do in fact have plans (link to Vanessa Kosoy), even if they are skeptical they will work.

The way that these articles are linked assumes they are an accepted view or presents them in a fairly unnuanced way which seems concerning, especially coupled with the framing of "we have to save the world" (which Benjamin Hilton has commented on).

How much should you do ‘off your own bat’ (to use the British cricket idiom)? Well, most value comes from people working in their roles, or from working with others to create change, but sometimes there are opportunities that would be missed without an individual going out on a limb.

The real problem is that in large-scale problems like AI safety, progress is usually continuous, not discrete. Thus we can talk about solving partial alignment problems, which realistically is the best EA/LessWrong can do. I don't expect them to ever be able to get AI to be particularly moral or to not destabilize society, but existential catastrophe is likely to be avoided.

Also, I'm going to steal part of Vaidehi Agarwalla's comment and improve upon it here:

Your post links to 2 articles from Eliezer Yudkowsky's / MIRI's perspective of AI alignment, which is a (but importantly, not the only) perspective of alignment research that is an outlier in its direness. We have good reason to believe that this is caused by unnecessary discreteness in their framing of the AI Alignment problem.
