Implications of evidential cooperation in large worlds

Lukas Finnveden

21 min read

Comments 1

Wei Dai

One way to affect things is to increase the probability that humanity ends up building a healthy and philosophically competent civilization. (But we already knew that was important.)

Do you know anyone who is actually working on this, especially the second part (philosophical competence)? I've been thinking about this myself, and wrote some LW posts on the topic. (In short, my main message is that if we care about our collective philosophical competence, the AI transition represents both a high risk and a unique opportunity.) But I feel like my public and private efforts to attract more attention and work to this area haven't yielded much. Do you see things differently?

^{^}

For even more references, see all the content gathered on this page, and more recently, this post written by Paul Christiano and this paper by Johannes Treutlein.

^{^}

If you know any plausible implication that I don’t list here — then either I don’t buy that it’s an implication of ECL, or it doesn’t seem sufficiently decision-relevant to me, or I haven’t thought about it / forgot about it and you should let me know.

^{^}

Whereas today, we can focus on handing off the future to a broadly competent and healthy civilization, and trust them with decisions about what to do with it.

^{^}

When I discuss how we should “care more about other humans’ universe-wide values”, I exclusively refer to universe-wide values held by humans on our current planet Earth, as opposed to values that might be held by distant human-like species. But the reason to benefit such values is to generate evidence that other people benefit our values on distant planets (not just here, on planet Earth). So why focus specifically on humans’ values? The reason is that we are more confident that some people treasure them, and it’s easy to benefit them via supporting humans who support them. For more, see here.

^{^}

“Misaligned AI” refers to AI whose values are very different from what was intended by the evolved species that first created them. If a distant species has very different values from us, and successfully aligns AI systems that they create, I wouldn’t count those as “misaligned AIs”.

^{^}

Or any other kind of acausal effects.

^{^}

Premature commitments are often a gamble that might gain you a better bargaining position while carrying a risk of everyone getting a lower payoff. Since that’s quite uncooperative, it seems plausible that ECL could discourage premature commitments. So this might be a reason to spread knowledge about ECL.

^{^}

Though it is also possible that careless thinking could increase them, given that they are by their nature caused by humanity making errors in the order in which it learns about and commits to doing certain things.

^{^}

And ideally, you would also think about other opportunities that faction A and faction B would have for benefiting each other, since you might also be providing evidence about those. Even more ideally, you might think about possible gains from trade that involve even more factions.

^{^}

Though the total effort that goes to each should perhaps still be allocated based on the number of people who support each set of values and who are sympathetic to ECL. Potentially adjusted by speculation about whether either set of values is underrepresented (among ECL-sympathizers) on Earth compared to the universe-at-large, in which case we should prioritize that set of values higher.

^{^}

It will provide the most evidence about the actions of people in exactly my position. But this is not where most of my acausal influence will come from, since even a small amount of evidence across a sufficiently large number of actors will weigh more heavily. The hypothesis that I’m putting forward here is that there might be some fairly broad class of actors which still share some key similarities with you, and whose decisions your decisions provide more evidence about. And that your values might be (or be correlated with) one of the key similarities.

^{^}

Though I am personally somewhat sympathetic to both upside- and downside-focused values, so this doesn’t have a big impact on my all-things-considered view.

^{^}

Even if the aliens who went extinct shared our values, their choice to prioritize non-AI extinction risk less could still have been net-positive ex-ante. For example, they might have reallocated resources in a way that reduced AI takeover risk by 0.1% and increased non-AI extinction risk by 0.1001%. The added 0.0001% of x-risk might have been worth the benefit of leaving behind empty space rather than AI-controlled space in 0.1% of worlds.
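The arithmetic in this footnote can be checked with a short sketch. This is only an illustration under assumptions that are not in the original: the function name, the three outcome values (`v_good`, `v_empty`, `v_ai`), and the simple model in which the net added extinction risk comes out of the worlds that would otherwise have gone well are all hypothetical.

```python
def delta_expected_value(v_good, v_empty, v_ai,
                         ai_risk_reduction=0.001,          # 0.1% less AI takeover risk
                         extinction_risk_increase=0.001001):  # 0.1001% more non-AI extinction risk
    """Change in expected value from the reallocation in the footnote.

    Hypothetical model: v_good, v_empty and v_ai are the values of a future
    controlled by the civilization, of empty space, and of AI-controlled
    space. The net added x-risk is assumed to come out of "good" worlds.
    """
    added_x_risk = extinction_risk_increase - ai_risk_reduction  # about 0.0001%
    return (extinction_risk_increase * v_empty   # more worlds end as empty space
            - ai_risk_reduction * v_ai           # fewer worlds end as AI-controlled space
            - added_x_risk * v_good)             # slightly fewer worlds go well


# Illustrative values only: a good future is worth 1, empty space 0.
gain_if_ai_bad = delta_expected_value(v_good=1.0, v_empty=0.0, v_ai=-0.01)
gain_if_ai_neutral = delta_expected_value(v_good=1.0, v_empty=0.0, v_ai=0.0)
print(gain_if_ai_bad > 0, gain_if_ai_neutral > 0)
```

Under this toy model, the reallocation is net-positive exactly when `v_empty - v_ai` exceeds roughly one thousandth of `v_good - v_empty`, which is the sense in which a small added x-risk can be worth leaving behind empty rather than AI-controlled space.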

^{^}

In particular, ECL suggests that we should discount benefits to aliens insofar as they on average correlate less strongly with us than the average civilization-with-our-values does. (When making relevant decisions.)

^{^}

As an example of someone with this view: This Facebook post by Eliezer Yudkowsky starts “I think that I care about things that would, in your native mental ontology, be imagined as having a sort of tangible red-experience or green-experience, and I prefer such beings not to have pain-experiences. Happiness I value highly is more complicated.” Yudkowsky has also written about the complexity and fragility of value elsewhere, e.g. here.

^{^}

In particular, ECL suggests that we should discount benefits to AI insofar as they correlate less strongly with us than actors-with-our-values do.

Summary (with links to sub-sections)

Affect whether (and how) future actors do ECL

Futures with aligned AI

Futures with misaligned AI

How us doing ECL affects our priorities

Care more about other humans’ universe-wide values

It matters less which universe-wide values control future resources (seems minor in practice?)

Upside- and downside-focused longtermists should care more about each other’s values

Care more about evolved aliens’ universe-wide values

Minor: Prioritize non-AI extinction risk less highly

Influence how AI benefits/harms alien civilizations’ values

Possibly: Weigh suffering-focused values somewhat higher if they are more universal

Care more about misaligned AIs’ universe-wide values

Minor: Prioritize AI takeover risk less highly

Positively influence misaligned AI

More

Appendices

What values do you need for this to be relevant?

More details on the split between humans, evolved species, and misaligned AI

Why distinguish humans from aliens?

Why distinguish evolved aliens from misaligned AIs?

Acknowledgments