MIRI 2024 Mission and Strategy Update

Malo

9 min read · Jan 5, 2024

Comments 36

Geoffrey Miller

Malo - bravo on this pivot in MIRI's strategy and priorities. Honestly it's what I've hoped MIRI would do for a while. It seems rational, timely, humble, and very useful! I'm excited about this.

I agree that we're very unlikely to solve 'technical alignment' challenges fast enough to keep AI safe, given the breakneck rate of progress in AI capabilities. If we can't speed up alignment work, we have to slow down capabilities work.

I guess the big organizational challenge for MIRI will be whether its current staff, who may have been recruited largely for their technical AI knowledge, general rationality, and optimism about solving alignment, can pivot towards this more policy-focused and outreach-focused agenda -- which may require quite different skill sets.

Let me know if there's anything I can do to help, and best of luck with this new strategy!

NickLaing

I appreciate the impressive epistemic humility it must have taken for one of the original and most prestigious alignment research orgs to decide that right now prioritising policy and communications work over research might be the best course to follow. I would imagine that might be a somewhat painful decision for technical people who have devoted their life to finding a technical solution. Nice one!

"Although we plan to pursue all three of these priorities, it’s likely that policy and communications will be a higher priority for MIRI than research going forward."

maxime

Could you unpack (1) how you plan to work towards "Increase the probability that the major governments of the world end up coming to some international agreement" and (2) how confident you are a foundational research org can transition into and make a difference in the policy space?

Larks

I assumed Nick was being sincere?

NickLaing

Yes I was being sincere. I might have missed some meta thing here as obviously I'm not steeped in AI alignment. Perhaps Trevor intended to reply on another comment but mistakenly replied here?

Will Aldred

I’m curious, since it sounds like MIRI folks may have thought about this, if you have takes on how best to allocate marginal effort between pushing for cooperation-to-halt-AI-progress on the one hand, and accelerating cognitive enhancement (e.g., mind uploading) on the other?^[1]

Like, I see that you list promoting cooperation as a priority, but to me, based on your footnote 3, it doesn’t seem obvious that promoting cooperation to buy ourselves time is a better strategy at the margin than simply working on mind uploading.^[2] (At least, I don’t see this being obviously true for people-trying-to-reduce-AI-risk at large, and I’d be interested in your—or others’—thoughts here, in case there’s something I’m missing. It may well be clearly true for MIRI given your comparative advantages; I’m asking this question from the perspective of overall AI risk reduction strategy.) Here’s that footnote 3:

Nate and Eliezer both believe that humanity should not be attempting technical alignment at its current level of cognitive ability, and should instead pursue human cognitive enhancement (e.g., via uploading), and then having smarter (trans)humans figure out alignment.

Related recent discussion:

“Does davidad's uploading moonshot work?”
- Context: David Dalrymple (aka davidad) recently outlined a concrete plan for mind uploading by 2040.

Related prediction markets:

Eliezer’s Manifold market, “If Artificial General Intelligence has an okay outcome, what will be the reason?”
- At present, the leading answer is: “Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility.”
My Metaculus question, “Will mind uploading happen before AGI?”
- The current community prediction is 1%.^[3]

^[1]
ETA: I’ve just noticed that earlier today, another Forum user posted a quick take on a similar theme, asking why there’s been no EA funding for cognitive enhancement projects. See here.
^[2]
The immediate lines of reasoning I can think of for why “put all marginal effort towards pausing AI” is the best strategy right now are: i) uploading is intractable given AGI timelines, and ii) future, just-before-the-pause models—GPT-7, say—could help significantly with mind uploading R&D. But then, assuming that uploading is our best bet for getting alignment right, I think ii just shifts the discussion to things like “where is the best place to pause (with respect to the tradeoff between powerful automation of uploading R&D versus not pausing too late)?” and “are there ways to push for differential progress in models’ capabilities? (e.g., narrow superhuman ability in neuroscience research).”
What’s more, as counters to i: Firstly, most problems fall within a 100x tractability range. Secondly, even if cooperation+pause efforts are clearly higher impact right now than object-level uploading work, I think there’s still the argument that field-building for mind uploading should start now, rather than once the pause is in place. Because if field-building starts now, then with luck there’ll be a body of uploading researchers ready to make the most of a future pause. (This argument doesn’t go through if the pause lasts indefinitely, because in that case there’s time to build up the mind uploading field from scratch in the pause. But it does go through if the pause is limited or fragile, which I tentatively believe are more likely possibilities. See also Scott Alexander’s taxonomy of AI pauses.)
^[3]
Taken together, these two prediction markets arguably paint a grim picture. Namely, the trades on Eliezer’s question imply that mind uploading is the most likely way that AGI goes well for humanity, but the forecasts on my question imply that we’re very unlikely to get mind uploading before AGI.

Geoffrey Miller

Will - we seem to be many decades away from being able to do 'mind uploading' or serious levels of cognitive enhancement, but we're probably only a few years away from extremely dangerous AI.

I don't think that betting on mind uploading or cognitive enhancement is a winning strategy, compared to pausing, heavily regulating, and morally stigmatizing AI development.

(Yes, given a few generations of iterated embryo selection for cognitive ability, we could probably breed much smarter people within a century or two. But they'd still run a million times slower than machine intelligences. As for mind uploading, we have nowhere near the brain imaging abilities required to do whole-brain emulations of the sort envisioned by Robin Hanson.)

Hayven Frienby

Agreed, but as I said earlier, acceptance seems to be the answer. We are limited, biological beings, who aren't capable of understanding everything about ourselves or the universe. We're animals. I understand this leads to anxiety and disquiet for a lot of people. Recognizing the danger of AI and the impossibility of transhumanism and mind uploading, I think the best possible path forward is to just accept our limited state, rationally stagnate our technology, and focus on social harmony and environmental protection as the way forward.

As for the despair this could cause to some, I'm not sure what the answer is. EA has taken a lot of its organizational structure and methods of moral encouragement from philosophies like Confucianism, religions, universities, etc. Maybe an EA-led philosophical research project into human ultimate hope (in the absence of techno-salvation) would be fruitful.

Geoffrey Miller

Hayven - there's a huge, huge middle ground between reckless e/acc ASI accelerationism on the one hand, and stagnation on the other hand.

I can imagine a moratorium on further AGI research that still allows awesome progress on all kinds of wonderful technologies such as longevity, (local) space colonization, geoengineering, etc -- none of which require AGI.

Hayven Frienby

We can certainly research those things, but using purely human efforts (no AI) progress will likely take many decades to see even modest gains. From a longtermist perspective that's not a problem of course, but it's a difficult thing to sell to someone not excited about living what is essentially a 20th century life so we can make progress long after they are gone. A ban on AI should come with a cultural shift toward a much less individualistic, less present-oriented value set.

Greg_Colbourn ⏸️

I think there is an unstated assumption here that uploading is safe. And by safe, I mean existentially safe for humanity^[1]. If in addition to being uploaded, a human is uplifted to superintelligence, would they -- indeed any given human in such a state -- be aligned enough with humanity as a whole to not cause an existential disaster? Arguably humans right now are only relatively existentially safe because power imbalances between them are limited.

Even the nicest human could accidentally obliterate the rest of us if uplifted to superintelligence and left running for subjective millions of years (years of our time). "Whoops, I didn't expect that to happen from my little physics experiment"; "Uploading everyone into a hive mind is what my extrapolations suggested was for the best (and it was just so boring talking to you all at one word per week of my time)".

^[1]
Although safety for the individual being uploaded would be far from guaranteed either.

Michael St Jules 🔸

We could upload many minds, trying to represent some (sub)distribution of human values (EDIT: and psychological traits), and augment them all slowly, limiting power imbalances between them along the way.

Greg_Colbourn ⏸️

Perhaps. But remember they will be smarter than us, so controlling them might not be so easy (especially if they gain access to enough computer power to speed themselves up massively. And they need not be hostile, just curious, to accidentally doom us.)

Will Aldred

Yes, this is a fair point; Holden has discussed these dangers a little in “Digital People Would Be An Even Bigger Deal”. My bottom-line belief, though, is that mind uploads are still significantly more likely to be safe than ML-derived ASI, since uploaded minds would presumably work, and act, much more similarly to (biological) human minds. My impression is that others also hold this view? I’d be interested if you disagree.

To be clear, I rank moratorium > mind uploads > ML-derived ASI, but I think it’s plausible that our strategy portfolio should include mind uploading R&D alongside pushing for a moratorium.

Greg_Colbourn ⏸️

I agree that they would most likely be safer than ML-derived ASI. What I'm saying is that they still won't be safe enough to prevent an existential catastrophe. It might buy us a bit more time (if uploads happen before ASI), but that might only be measured in years. Moratorium >> mind uploads > ML-derived ASI.

Michael St Jules 🔸

Why do you expect an existential catastrophe from augmented mind uploads?

Greg_Colbourn ⏸️

Because of the crazy high power differential, and propensity for accidents (can a human really not mess up on an existential scale if acting for millions of years subjectively at superhuman capability levels?). As I say in my comment above:

Even the nicest human could accidentally obliterate the rest of us if uplifted to superintelligence and left running for subjective millions of years (years of our time). "Whoops, I didn't expect that to happen from my little physics experiment"; "Uploading everyone into a hive mind is what my extrapolations suggested was for the best (and it was just so boring talking to you all at one word per week of my time)".

Michael St Jules 🔸

This doesn’t seem like a strong enough argument to justify a high probability of existential catastrophe (if that's what you intended?).

At vastly superhuman capabilities (including intelligence and rationality), it should be easier to reduce existential-level mistakes to tiny levels. They would have vastly more capability for assessing and mitigating risks and for moral reflection (not that this would converge to some moral truth; I don’t think there is any).

If you think this has a low chance of success (if we could delay AGI long enough to actually do it), then alignment seems pretty hopeless to me on that view, and a temporary pause only delays the inevitable doom.

I do think we could do better (for upside-focused views) by ensuring more value pluralism and preventing particular values from dominating, e.g. by uploading and augmenting multiple minds.

Greg_Colbourn ⏸️

At vastly superhuman capabilities (including intelligence and rationality), it should be easier to reduce existential-level mistakes to tiny levels. They would have vastly more capability for assessing and mitigating risks and for moral reflection

They are still human though, and humans are famous for making mistakes, even the most intelligent and rational of us. It's even regarded by many as part of what being human is - being fallible. That's not (too much of) a problem at current power differentials, but it is when we're talking of solar-system-rearranging powers for millions of subjective years without catastrophic error...

a temporary pause only delays the inevitable doom.

Yes. The pause should be indefinite, or at least until global consensus to proceed, with democratic acceptance of whatever risk remains.

Hayven Frienby

Thank you for this well-sourced comment. I'm not affiliated with MIRI, so I can't answer the questions directed to the OP. With that said, I did have a small question to ask you. What would be your issue with simply accepting human fragility and limits? Does the fact that we don't and can't know everything, live no more than a century, and are at risk for disease and early death mean that we should fundamentally alter our nature?

I think the best antidote to the present moment's dangerous dance with AI isn't mind uploading or transhumanism, but acceptance. We can accept that we are animals, that we will not live forever, and that any ultimate bliss or salvation won't come via silicon. We can design policies that ensure these principles are always upheld.

Guy Raveh

How does the choice to publish MIRI's main views as LessWrong posts rather than, say, articles in peer-reviewed journals or more pieces in the media, square with the need to convince a much broader audience (including decision-makers in particular)?

RobertM

There is no button you can press on demand to publish an article in either a peer-reviewed journal or a mainstream media outlet.

Publishing pieces in the media (with minimal 3rd-party editing) is at least tractable on the scale of weeks, if you have a friendly journalist. The academic game is one to two orders of magnitude slower than that. If you want to communicate your views in real-time, you need to stick to platforms which allow that.

I do think media comms is a complementary strategy to direct comms (which MIRI has been using, to some degree). But it's difficult to escape the fact that information posted on LW, the EA forum, or Twitter (by certain accounts) makes its way down the grapevine to relevant decision-makers surprisingly often, given how little overhead is involved.

titotal

But it's difficult to escape the fact that information posted on LW, the EA forum, or Twitter (by certain accounts) makes its way down the grapevine to relevant decision-makers surprisingly often, given how little overhead is involved.

This isn't necessarily a good thing, if the information being passed down is flawed or incorrect, due to the lack of rigor involved.

The judges of quality for peer-reviewed papers are domain-level experts who contribute their relevant expertise. The judges of quality for blog posts are a collection of random people on the internet, often few of whom have relevant expertise and who are often unable to distinguish between actual truth and convincing-sounding BS.

The ideal situation would be to write peer reviewed papers and then communicate their results on blogs, but this won't be a good fit for a lot of things, given that some fields are not well established and some points are too small or obvious to be worth writing up academically.

Guy Raveh

Publishing pieces in the media (with minimal 3rd-party editing) is at least tractable on the scale of weeks, if you have a friendly journalist. The academic game is one to two orders of magnitude slower than that.

Given that MIRI has held these views for decades, I don't quite see how the timeline for academic publication is at issue here.

Malo

We’ve also been doing media and we’re working on building capacity and gaining expertise to do more of it more effectively.

Publishing research in more traditional venues is also something we’ve been chatting about internally.

Siebe

I just want to share that I think you did an excellent job explaining the arguments on the recent Politico Tech podcast, in a way that I think comes across as very grounded and reasonable, which makes me more optimistic that MIRI can make this shift. I also hope that you can nudge Eliezer more towards this style of communication, which I think would make his audience more receptive. (I thought the tone of the TIME piece didn't seem professional enough). This seems especially important if Eliezer will also focus on communications and policy instead of research.

titotal

I'm confused. The comment reads as sincere to me? What part of it did you think was a joke?

Otto

Congratulations on a great prioritization!

Perhaps the research that we (Existential Risk Observatory) and others (e.g. @Nik Samoylov, @KoenSchoen) have done on effectively communicating AI xrisk, could be something to build on. Here's our first paper and three blog posts (the second includes measurement of Eliezer's TIME article effectiveness - its numbers are actually pretty good!). We're currently working on a base rate public awareness update and further research.

Best of luck and we'd love to cooperate!

Prometheus

Judging from all the comments in agreement, from people who probably have no political power to actually implement these things, but who might have been useful toward actually solving the problem, this pivot is probably a net negative. You will probably fail at having much of a political influence, but succeed at dissuading people from doing technical research.

Hayven Frienby

This is why I don't think the goal should be to grow the movement. Movements that grow by seeking converts usually end up drifting far from their original mission and taking on negative, irrational aspects of the societies they emerge from. Religious and political history provide dozens of examples of this process taking place.

EA should be about quality over quantity, in my opinion, and "social status" is both figuratively and literally worthless in the face of extinction.

Hayven Frienby

I fully agree with the shift away from research and toward policy. With how close we are to what you termed smarter-than-human AI (also called AGI or ASI, but your term is much more precise, so I'll use it going forward), research is not where efforts are best placed. We could be looking at a human extinction scenario (or an equally bad outcome, such as the permanent limiting of human potential) within 5-20 years. That's an emergency situation as far as I'm concerned. Once the necessary laws and procedures are in place, research can continue.

I can't speak for all EAs, but my ultimate goal is to see a world without smarter-than-human AI until humanity outgrows its tendencies to wage war and seek personal gain over the flourishing of all sentient beings. This would likely place ASI development somewhere between 100 years AP* and never, and probably closer to the "never" end of that timescale. This is something we have to accept--especially those of us with tech-loving tendencies.

I'm under no illusions that Silicon Valley would ever accept this, but in a democratic society they aren't the ones calling the shots. A democratic government can ban agents / generative / smarter-than-human AI, and the actors I mentioned previously would simply have to accept it. We need the US, EU, Canada, Taiwan, and Japan to adopt MIRI guidelines on AI safety, security, and non-proliferation--and these conversations must begin at the local level.

If we are looking to shift the Overton window, we have to target our communications toward "ordinary people" and policymakers, not tech geeks and data wonks. This will be my top priority going forward, along with animal welfare activism.

*AP = after present

Evan_Gaensbauer

-3

I appreciate the pivot to a better-devised and merely pessimistic strategy on MIRI's part, as opposed to a deceptively dignified and misrepresentative resignation to death.

RobBensinger

Every aspect of that summary of how MIRI's strategy has shifted seems misleading or inaccurate to me.

Evan_Gaensbauer

This was an acerbic and bitter comment I made as a reference to the fake MIRI strategy update in 2022 from Eliezer, the notorious "Dying with Dignity." I've thought about this for a few days and I'm sorry I made that nasty comment.

I was considering deleting or retracting it, though I've decided against that. The fact my comment has a significantly net negative karma score seems like punishment enough. Retracting the comment now probably wouldn't change that anyway.

I've decided against deleting or retracting this comment because its reception seems like a useful signal for MIRI to receive. At least as of the time I'm writing this reply, my original comment has received more agreement than disagreement. It's valid for you, or whoever else at MIRI disagrees, to regard the perception I snarkily expressed as wrong or unserious. I expect it's still worth MIRI being aware that almost as many people still distrust as trust MIRI as being sufficiently honest in its public communications.

Malo

I expect it's still worth MIRI being aware that almost as many people still distrust as trust MIRI as being sufficiently honest in its public communications.

FWIW, I found this last bit confusing. In my experience chatting with folk, regardless of how much they agree with or like MIRI, they usually think MIRI is quite candid and honest in its communication.

(TBC, I do think the “Death with Dignity” post was needlessly confusing, but that’s not the same thing as dishonest.)

Bob

-5

Nonsensical cheems. Go for a walk. Have a drink. Lighten up.

^{^}

Thanks to Rob Bensinger, Gretta Duleba, Matt Fallshaw, Alex Vermeer, Lisa Thiergart, and Nate Soares for your valuable thoughts on this post.

^{^}

As Nate has written about in Superintelligent AI is necessary for an amazing future, but far from sufficient, we would consider it an enormous tragedy if humanity never developed artificial superintelligence. However, regulators may have a difficult time determining when we’ve reached the threshold “it’s now safe to move forward on AI capabilities.”

One alternative, proposed by Nate, would be for researchers to stop trying to pursue de novo AGI, and instead pursue human whole-brain emulation or human cognitive enhancement. This helps largely sidestep the issue of bureaucratic legibility, since the risks are far lower and success criteria are a lot clearer; and it could allow us to realize many of the near-term benefits of aligned AGI (e.g., for existential risk reduction).

^{^}

Various people at MIRI have different levels of hope about this. Nate and Eliezer both believe that humanity should not be attempting technical alignment at its current level of cognitive ability, and should instead pursue human cognitive enhancement (e.g., via uploading), and then having smarter (trans)humans figure out alignment.

^{^}

Lisa comments:

I personally would like to note my dissenting perspective on this overall choice.
While I agree MIRI can contribute in the short term with a comms and policy push towards effective regulation, in the medium to long term I think research is our greater comparative advantage and I think we should keep substantial focus here (as well as increase empirical research staff). We should continue doing research but shift our focus more towards technical work which can support regulatory efforts (ex. safety standards, etc), including empirical work but also for example drawing on our Agent Foundations experience to produce theoretical frameworks. I think stronger technical (ideally empirically grounded) research arguments targeting scientific government advisors, regulators and lab decision makers (rather than public-oriented comms using philosophical arguments) on why there is AI risk and why mitigation makes sense, are more critically missing from the picture and a component MIRI can perhaps uniquely contribute. I also personally expect more impact to come from influencing lab decision makers and creating more of an academic/research consensus on safety risks rather than hoping for substantial regulatory success in the shorter term of the next 1–3 years.
Further, solving critical open questions in the safety standards space on how to regulate using metrics (and how to do this scientifically and correctly) seems to me a priority for at least the next 1–2 years. I think it’s important to have a diversity of perspectives including non-lab organizations like MIRI creating this knowledge base.

^{^}

In the case of our Agent Foundations research team, team size stayed the same, but we didn’t put any effort into trying to expand the team.

^{^}

We also helped set up a co-working space and related infrastructure for other AI x-risk orgs like Redwood Research. We’re fans of Redwood, and often direct researchers and engineers to apply to work there if they don’t obviously fit the far more unusual and constrained research niches at MIRI.

^{^}

Where “soon” means, roughly, “there’s a lot of uncertainty here, but it’s a very live possibility that AGI is only a few years away; and it no longer seems likely to be (for example) 30+ years away.” In a fall 2023 poll of most MIRI researchers, we expect AGI (according to the definition from this Metaculus market) in a median of 9 years and a mean of 14.6 years. One researcher was an outlier at 52 years; the majority predicted under ten years.
