
People should stop falling back on the argument that working on AI safety research is a 'Pascal's Mugging', because that doesn't address the actual reasons people in the field think we should work on AI safety today.

Most people who work on AI safety think the chances of affecting the outcome are not infinitesimal but entirely macroscopic, in the same way that voting in an election has a low but real chance of changing the result, an extra researcher has a low but real chance of bringing forward a cure for malaria, and an extra person working on Ebola containment makes a pandemic less likely.

For example, someone involved might believe:

i) There's a 10% chance of humanity creating a 'superintelligence' within the next 100 years.

ii) There's a 30% chance that the problem can be solved if we work on it harder and earlier.

iii) A research team of five suitable people starting work on safety today and continuing through their working lives would raise the odds of solving the problem by 1% of that (0.3 percentage points). (This passes a sanity check, as they would represent a 20% increase in the effort being made today.)

iv) Collectively they therefore have a 0.03% chance of making AI significantly more aligned with human values in the next 100 years, so each individual person involved has a 0.006 percentage point share. (The arithmetic is sketched below.)
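For concreteness, here is a minimal sketch of that arithmetic in Python. The inputs are the illustrative beliefs from points i) to iv), not claims about what the true probabilities are.

```python
# Illustrative expected-value arithmetic for points i) to iv) above.
# All inputs are the hypothetical beliefs stated in the post, not estimates of the real numbers.

p_superintelligence = 0.10  # i) chance of a superintelligence within 100 years
p_solvable = 0.30           # ii) chance the problem can be solved with earlier, harder work
team_effect = 0.01          # iii) a five-person team raises the odds of solving it by 1% of that
team_size = 5

# iv) chance the team makes AI significantly more aligned within 100 years
p_team_impact = p_superintelligence * p_solvable * team_effect
p_individual_share = p_team_impact / team_size

print(f"Team impact:      {p_team_impact:.4%}")       # 0.0300%
print(f"Individual share: {p_individual_share:.4%}")  # 0.0060%
```

Note that multiplying the three estimates together quietly assumes they are independent, which is itself a simplification you may want to question.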

Note that the case presented here has nothing to do with there being some enormous and arbitrary value available if you succeed, which is central to the weirdness of the Pascal's Mugging case.

Do you think the numbers in this calculation are way over-optimistic? OK - that's completely reasonable!

Do you think we can't predict whether the sign of the work we do now is positive or negative? Is it better to wait and work on the problem later? There are strong arguments for that as well!

But those are the arguments that should be made and substantiated with evidence and analysis, not quick dismissals that people are falling for a 'Pascal's Mugging', which they mostly are not.

Given the beliefs of this person, this is no more a Pascal's Mugging than working on any basic science research, or campaigning for an outsider political campaign, or trying to reform a political institution. These all have unknown but probably very low chances of making a breakthrough, but could nevertheless be completely reasonable things to try to do.

Here's a similar thing I wrote years ago: If elections aren't a Pascal's Mugging, existential risk work shouldn't be either.

Postscript

As far as I can see all of these are open possibilities:

1) Solving the AI safety problem will turn out to be unnecessary, and our fears today are founded on misunderstandings about the problem.

2) Solving the AI safety problem will turn out to be relatively straightforward on the timeline available.

3) It will be a close call whether we manage to solve it in time - it will depend on how hard we work and when we start.

4) Solving the AI safety problem is almost impossible and we would have to be extremely lucky to do so before creating a super-intelligent machine. We are therefore probably screwed.

We collectively haven't put enough focussed work into the problem yet to have a good idea where we stand. But that's hardly a compelling reason to assume 1), 2) or 4) and not work on it now.

Comments



It might not be a strong response to the whole cause area, but isn't it the only response to the Bostrom-style arguments linked below? Those, in my experience, cover the majority of the arguments I hear in favour of working on x-risk.

Very few one line arguments are strong responses to whole world views that smart people actually believe, so I sort of feel like there's nothing to see here.

I asked Bostrom about this and he said he never even made this argument in this way to the journalist. Given my experience of the media misrepresenting everything you say and wanting to put sexy ideas into their pieces, I believe him.

The New Yorker writer got it straight out of this paper of Bostrom's (paragraph starting "Even if we use the most conservative of these estimates"). I've seen a couple of people report that Bostrom made a similar argument at EA Global.

Look, no doubt the argument has been made by people in the past, including Bostrom who wrote it up for consideration as a counterargument. I do think the 'astronomical waste' argument should be considered, and it's far from obvious that 'this is a Pascal's Mugging' is enough to overcome its strength.

But it's also not the main, only, or best reason on which many people who work on these problems ground their choice to do so.

So if you dismiss this argument, then before dismissing the work itself, move on and look at what you think is the strongest argument for it, not the weakest.

I actually think there's an appropriate sense in which it is the strongest argument -- not in that it's the most robust, but in that it has the strongest implications. I think this is why it gets brought up (and that it's appropriate to do so).

Agreed - despite being counterintuitive, it's not obviously a flawed argument.

If I were debating you on the topic, it would be wrong to say that you think it's a Pascal's mugging. But I read your post as being a commentary on the broader public debate over AI risk research, trying to shift it away from "tiny probability of gigantic benefit" in the way that you (and others) have tried to shift perceptions of EA as a whole or the focus of 80k. And in that broader debate, Bostrom gets cited repeatedly as the respectable, mainstream academic who puts the subject on a solid intellectual footing.

(This is in contrast to MIRI, which as SIAI was utterly woeful and which in its current incarnation still didn't look like a research institute worthy of the name when I last checked in during the great Tumblr debate of 2014; maybe they're better now, I don't know.)

In that context, you'll have to keep politely telling people that you think the case is stronger than the position your most prominent academic supporter argues from, because the "Pascal's mugging" thing isn't going to disappear from the public debate.

I have no opinion on what Bostrom did or didn't say, to be clear. I've never even spoken to him. Which is why I said 'Bostrom-style'. But I have heard this argument, in person, from many of the AI risk advocates I've spoken to.

Look, any group in any area can present a primary argument X, be met by (narrow) counterargument Y, and then say 'but Y doesn't answer our other arguments A, B, C!'. I can understand why that sequence might be frustrating if you believe A, B, C and don't personally put much weight on X, but I just feel like that's not an interesting interaction.

It seems like Rob is arguing against people using Y (the Pascal's Mugging analogy) as a general argument against working on AI safety, rather than as a narrow response to X.

Presumably we can all agree with him on that. But I'm just not sure I've seen people do this. Rob, I guess you have?

I get what you're saying, but, e.g., in the recent profile of Nick Bostrom in the New Yorker:

No matter how improbable extinction may be, Bostrom argues, its consequences are near-infinitely bad; thus, even the tiniest step toward reducing the chance that it will happen is near-infinitely valuable. At times, he uses arithmetical sketches to illustrate this point. Imagining one of his utopian scenarios—trillions of digital minds thriving across the cosmos—he reasons that, if there is even a one-per-cent chance of this happening, the expected value of reducing an existential threat by a billionth of a billionth of one per cent would be worth a hundred billion times the value of a billion present-day lives. Put more simply: he believes that his work could dwarf the moral importance of anything else.

While the most prominent advocate in the respectable-academic part of that side of the debate is making Pascal-like arguments, there's going to be some pushback about Pascal's mugging.
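As a side note, the quoted passage can be sanity-checked using only the numbers it supplies: a one-per-cent chance of the utopian scenario, a risk reduction of a billionth of a billionth of one per cent, and a benchmark of a hundred billion times a billion present-day lives. The minimal Python sketch below just solves for how many future lives the scenario would have to contain for the claim to go through; none of the figures are estimates of real probabilities.

```python
# Back-of-the-envelope check on the arithmetic in the quoted New Yorker passage.
# Every figure here comes from the quote itself, not from independent estimates.

p_utopia = 0.01                      # "a one-per-cent chance of this happening"
risk_reduction = 1e-9 * 1e-9 * 0.01  # "a billionth of a billionth of one per cent"
benchmark_lives = 1e11 * 1e9         # "a hundred billion times ... a billion present-day lives"

# Expected benefit of the risk reduction = p_utopia * future_lives * risk_reduction.
# Solve for how many future lives the utopian scenario must contain to match the benchmark:
future_lives_needed = benchmark_lives / (p_utopia * risk_reduction)
print(f"Future lives needed: {future_lives_needed:.1e}")  # roughly 1e+42
```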

I've also seen Eliezer (the person who came up with the term Pascal's mugging) give talks where he explicitly disavows this argument.

Three things:

i) I bet Bostrom thinks the odds of a collective AI safety effort achieving its goal are better than 1%, which would itself be enough to avoid the Pascal's Mugging situation.

ii) This is a fallback position from which you can defend the work if someone thinks it almost certainly won't work. I don't think we should do that; instead we should argue that we can likely solve the problem. But I see the temptation.

iii) I don't think it's clear you should always reject a Pascal's Mugging (or if you should, it may only be because there are more promising options for enormous returns than giving it to the mugger).

Yes! Thank you for this. Pascal's Muggings have to posit paranormal/supernatural mechanisms to work. But x-risk isn't like that. Big difference which people seem to overlook. And Pascal's Muggings involve many orders of magnitude smaller chances than even the most pessimistic x-risk outlooks.

I agree with your second point but not your first. Also it's possible you mean "optimistic" in your second point: if x-risks themselves are very small, that's one way for the change in probability as a result of our actions to be very small.

I mean pessimism about the importance of x-risk research, which is more or less equivalent to optimism about the future of humanity. Similar idea.

By the way, this article series of yours convinced me of the importance of AI safety work. Thank you and good work!
