Four mindset disagreements behind existential risk disagreements in ML

This post is co-authored with Ben Garfinkel. It is cross-posted from the CEA blog. A PDF version can be found here. Summary: Some strategic decisions available to the effective altruism m...

Introducing Impact List: a ranking of philanthropists by expected lives saved

Elliot Olds·3d ago·6m read

TL;DR: I'm releasing a website that ranks philanthropists according to EA principles and research, and allows users to re-rank the list using their own assumptions. I'd like feedback and help making it better. I'd especially like ideas for how to make the results more trustworthy. Funding may be available. Crossposted to LessWrong. ...

Marginal Victories: career advising and opportunities for U.S. democracy preservation & political work

Annika Burman 🔸·5d ago·2m read

TL;DR: Marginal Victories is a new initiative to provide 1:1 career advising, opportunities, and resources for people exploring high-leverage U.S. democracy preservation and political work. Built by impact-oriented people doing pro-democracy work, the early MVP is now up at marginalvictories.org. Fill out the 10-minute form now to receive these resources as they become available over the next few...

Recent opportunities to take action

Amsterdam Insect Protest

Bentham's Bulldog·13h ago·3m read

Marginal Victories: career advising and opportunities for U.S. democracy preservation & political work

Annika Burman 🔸·5d ago·2m read

I'm stepping down as Hive's Executive Director, and we're hiring my successor

SofiaBalderson, Hive·6d ago·3m read

Violet Hour

Nice post!

I think I’d want to revise your first taxonomy a bit. To me, one (perhaps the primary) disagreement among ML researchers regarding AI risk consists of differing attitudes to epistemological conservatism, which I think extends beyond making conservative predictions. Here’s why I prefer my framing:

As you note, to say that someone makes a conservative prediction comes with other connotations, like predictions being robust to uncertainty.
If I say that someone has a conservative epistemology, I think this more faithfully captures the underlying disposition — namely, that they are conservative about the power of abstract theoretical arguments to deliver strong conclusions in the absence of more straightforwardly relevant empirical data.
- I don't interpret the most conservative epistemologists as primarily driven by a fear of making ‘extreme’ predictions. Rather, I interpret them as expressing skepticism about the presence of any evidential signal offered by certain modes of more abstract argumentation.
- For example, Richard has a more conservative epistemology than you, though obviously he is highly non-conservative relative to most. David Thorstad seems more conservative still. The hard-nosed lab scientist with little patience for philosophy is yet more conservative than David.

I also think that the language of conservative epistemology helps counteract (what I see as) a mistaken frame motivating this post. (I’ll try to motivate my claim, but I’ll note that I remain a little fuzzy on exactly what I’m trying to gesture at.)

The mistaken frame I see is something like “modeling conservative epistemologists as if they were making poor strategic choices within a non-conservative world-model”. You state:

The level of concern and seriousness I see from ML researchers discussing AGI on any social media platform or in any mainstream venue seems wildly out of step with "half of us think there's a 10+% chance of our work resulting in an existential catastrophe".

I have concerns about you inferring this claim from the survey data provided,^[1] but perhaps more pertinently for my point: I think you’re implicitly interpreting the reported probabilities as something like all-things-considered credences in the proposition researchers were queried about. I’m much more tempted to interpret the probabilities offered by researchers as meaning very little. Sure, they’ll provide a number on a survey, but this doesn’t represent ‘their’ probability of an AI-induced existential catastrophe.

I don’t think that most ML researchers have, as a matter of psychological fact, any kind of mental state that’s well-represented by a subjective probability about the chance of an AI-induced existential catastrophe. They’re more likely to operate with a conservative epistemology, in a way that isn’t neatly translated into probabilistic predictions over an outcome space that includes the outcomes you are most worried about. I think many people are likely to filter out the hypothesis given the perceived lack of evidential support for the outcome.

I actually do think the distinction between 'conservative predictions' and 'conservative decision-making' is helpful, though I'm skeptical about its relevance for analyzing different attitudes to AI risk.

Here’s one place I think the distinction between ‘conservative predictions’ and ‘conservative decision-making’ would be useful: early decisions about COVID.
- Many people (including epidemiologists!) claimed that we lacked evidence about the efficacy of masks for preventing COVID, but didn’t suggest that people should wear masks anyway.
- I think ‘masks might help COVID’ would have been in the outcome space of relevant decision-makers, and so we can describe their decision-making as (overly) conservative, even given their conservative predictions.
However, I think that ‘literal extinction from AGI’ just isn’t in the outcome space of many ML researchers, because arguments for that claim become harder to make as your epistemology becomes more conservative.
- I don’t think that ‘[Person] will offer a probability when asked in a survey’ provides much evidence about whether that outcome is in [Person]’s outcome space in anything like a stable way.

If my analysis is right, then a first-pass at the practical conclusions might consist in being more willing to center arguments about alignment from a more empirically grounded perspective (e.g. here), or more directly attempting to have conversations about the costs and benefits of more conservative epistemological approaches.

^{^}
First, there are obviously selection effects present in surveying OpenAI and DeepMind researchers working on long-term AI. Citing this result without caveat feels similar using (e.g.) PhilPapers survey results revealing that most specialists in philosophy of religion are to support the claim that most philosophers are theists. I can also imagine similar selection effects being present (though to lesser degrees) in the AI Impacts Survey. Given selection effects, and given that response rates from the AI Impacts survey were ~17%, I think your claim is misleading.

Four mindset disagreements behind existential risk disagreements in ML

Four mindset disagreements behind existential risk disagreements in ML

1. "Conservative" predictions, versus conservative decision-making

2. Waiting for a fire alarm, versus intervening proactively

3. Anchoring to what's familiar, versus trying to account for potential novelties in AGI

4. Modeling existential risks in far mode, versus near mode