Microdooms averted by working on AI Safety

Nikola

Microdooms averted by working on AI Safety

Nikola

Comments

More from the author

122

Orienting to 3 year AGI timelines

Nikola·1y ago·10m read

Getting Actual Value from “Info Value”: Example from a Failed Experiment

Nikola, aL, tlevin, NickGabs·3y ago·4m read

An organic pitch for undergrads: do it when people ask what your major is

Nikola·4y ago·1m read

Curated and popular this week

Hard-to-reverse decisions destroy option value

Stefan_Schubert·9y ago·Curated 1d ago·14m read

This post is co-authored with Ben Garfinkel. It is cross-posted from the CEA blog. A PDF version can be found here. Summary: Some strategic decisions available to the effective altruism m...

Introducing Impact List: a ranking of philanthropists by expected lives saved

Elliot Olds·2d ago·6m read

TL;DR: I'm releasing a website that ranks philanthropists according to EA principles and research, and allows users to re-rank the list using their own assumptions. I'd like feedback and help making it better. I'd especially like ideas for how to make the results more trustworthy. Funding may be available. Crossposted to LessWrong. ...

If you're agentic, work in biosecurity

sharmaayushmaan🔸·6d ago·7m read

Disclaimer: Although I work on the Groups Team at CEA, I’m writing this in a personal capacity, and this post does not constitute an endorsement by CEA. Agency - the realisation that you really can just do things. TL;DR Biosecurity needs people (of any background) who are agentic and have a high execution velocity and track record....

Recent opportunities to take action

Marginal Victories: career advising and opportunities for U.S. democracy preservation & political work

Annika Burman 🔸·4d ago·2m read

I'm stepping down as Hive's Executive Director, and we're hiring my successor

SofiaBalderson, Hive·4d ago·3m read

Starting an EA group @ SUNY Binghamton

micahzarin·3d ago·1m read

Mo Putera

I think GiveWell shouldn’t be modeled as wanting to recommend organizations that save as many current lives as possible. I think a more accurate way to model them is “GiveWell recommends organizations that are [within the Overton Window]/[have very sound data to back impact estimates] that save as many current lives as possible.” If GiveWell wanted to recommend organizations that save as many human lives as possible, their portfolio would probably be entirely made up of AI safety orgs.

This paragraph, especially the first sentence, seems to be based on a misunderstanding I used to share, which Holden Karnofsky tried to correct back in 2011 (when he was still at GiveWell) with the blog post Why we can’t take expected value estimates literally (even when they’re unbiased) in which he argued (emphasis his):

While some people feel that GiveWell puts too much emphasis on the measurable and quantifiable, there are others who go further than we do in quantification, and justify their giving (or other) decisions based on fully explicit expected-value formulas. The latter group tends to critique us – or at least disagree with us – based on our preference for strong evidence over high apparent “expected value,” and based on the heavy role of non-formalized intuition in our decisionmaking. ...
We believe that people in this group are often making a fundamental mistake... [of] estimating the “expected value” of a donation (or other action) based solely on a fully explicit, quantified formula, many of whose inputs are guesses or very rough estimates.
We believe that any estimate along these lines needs to be adjusted using a “Bayesian prior”; that this adjustment can rarely be made (reasonably) using an explicit, formal calculation; and that most attempts to do the latter, even when they seem to be making very conservative downward adjustments to the expected value of an opportunity, are not making nearly large enough downward adjustments to be consistent with the proper Bayesian approach.
This view of ours illustrates why – while we seek to ground our recommendations in relevant facts, calculations and quantifications to the extent possible – every recommendation we make incorporates many different forms of evidence and involves a strong dose of intuition. And we generally prefer to give where we have strong evidence that donations can do a lot of good rather than where we have weak evidence that donations can do far more good – a preference that I believe is inconsistent with the approach of giving based on explicit expected-value formulas (at least those that (a) have significant room for error (b) do not incorporate Bayesian adjustments, which are very rare in these analyses and very difficult to do both formally and reasonably).

(He since developed this view further in the 2014 post Sequence thinking vs cluster thinking.) Further down, Holden wrote

My prior for charity is generally skeptical, as outlined at this post. Giving well seems conceptually quite difficult to me, and it’s been my experience over time that the more we dig on a cost-effectiveness estimate, the more unwarranted optimism we uncover.

This guiding philosophy hasn't changed; in GiveWell's How we work - criteria - cost-effectiveness they write:

Cost-effectiveness is the single most important input in our evaluation of a program's impact. However, there are many limitations to cost-effectiveness estimates, and we do not assess programs solely based on their estimated cost-effectiveness. We build cost-effectiveness models primarily because:
They help us compare programs or individual grant opportunities to others that we've funded or considered funding; and
Working on them helps us ensure that we are thinking through as many of the relevant issues as possible.

which jives with what Holden wrote in the relative advantages & disadvantages of sequence thinking vs cluster thinking article above.

Note that this is for global health & development charities, where the feedback loops to sense-check and correct cost-effectiveness analyses that guide resource allocation & decision-making are much clearer and tighter than for AI safety orgs (and other longtermist work more generally). If it's already this hard for GHD work, I get much more skeptical of CEAs in AIS with super-high EVs, just in model uncertainty terms.

This isn't meant to devalue AIS work! I think it's critical and important, and I think some of the "p(doom) modeling" work is persuasive (MTAIR, Froolow, and Carlsmith come to mind). Just thought that "If GiveWell wanted to recommend organizations that save as many human lives as possible, their portfolio would probably be entirely made up of AI safety orgs" felt off given what they're trying to do, and how they're going about it.

Microdooms averted by working on AI Safety

Microdooms averted by working on AI Safety

Diminishing returns model

Pareto distribution model

Linear growth model

One microdoom is A Lot Of Impact

Why doesn’t GiveWell recommend AI Safety organizations?