I want to point out that the ethical schools of thought that you're (probably) most anti-aligned with (e.g., that certain behaviors and even thoughts are deserving of eternal divine punishment) are also far more prominent in the West, proportionately even more so than the ones you're aligned with.
For what it's worth, we recently ran a cross-cultural survey (n > 1,000 after extensive filtering) on endorsement of eternal extreme punishment, with questions like "If I could create a system that makes deserving people feel unbearable pain forever, I would" and "If hell didn't exist, or if it stopped existing, we should create it [...]".
~16–19% of Chinese respondents consistently endorsed such statements, compared to ~10–14% of US respondents—despite China being majority atheist/agnostic.[1]
Of course, online surveys are notoriously unreliable, especially on such abstract questions. But if these results hold up, concerns about eternal punishment would actually count against a China-dominated future, not in favor of one.
On individual questions, agreement rates were usually much higher, especially in China and other non-Western countries. The above numbers reflect a conservative conjunctive measure filtering for consistency across multiple questions.
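To illustrate what the conjunctive measure amounts to, here is a minimal sketch in Python/pandas, using made-up item names and toy data rather than the actual survey pipeline: a respondent only counts toward the headline figure if they endorse every item in the battery, whereas per-item rates count endorsement of any single statement.

```python
import pandas as pd

# Toy data for illustration only: 1 = endorsed the statement, 0 = did not.
# Column names are hypothetical, not the actual survey items.
df = pd.DataFrame({
    "country": ["CN", "CN", "CN", "US", "US"],
    "q_create_pain_system": [1, 1, 0, 0, 1],
    "q_create_hell": [1, 0, 0, 0, 1],
    "q_eternal_punishment_deserved": [1, 1, 1, 0, 0],
})

items = ["q_create_pain_system", "q_create_hell", "q_eternal_punishment_deserved"]

# Per-item agreement: share endorsing each individual statement (tends to be higher).
per_item = df.groupby("country")[items].mean()

# Conjunctive measure: share endorsing *all* statements, i.e. consistent
# endorsement across the whole battery (the more conservative headline figure).
conjunctive = df[items].all(axis=1).groupby(df["country"]).mean()

print(per_item)
print(conjunctive)
```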
Hi Luke,
Sorry, it seems that we haven't received your email (where did you send it, btw?). As stated on our website, we planned to give around $30m this year (though by now it seems that we may end up closer to $40m). We're currently considering our strategy and the available opportunities and may give substantially more (possibly >$100m/year) in the coming years.
Hi, makes sense, thanks.
I certainly don't think investigative journalism is a panacea or tremendously effective. Definitely agree that most voters are motivated by tribalism and economic conditions.
Still, even in this age, investigative journalism sometimes seems to have substantial effects. For example, the book on Biden's cognitive decline and its cover-up made some waves and hopefully will have some positive effects.
More generally, I'm also not sure to what extent our age is highly unusual. It's not like people in the 1930s and 1940s (in say, Germany) weren't subject to echo chambers and didn't exhibit a few "skill issues" in critical thinking.
My sense is that many elections are decided by a few percent of voters, with around 20-40% on each side (the "base") being essentially immovable by any argument. So journalism could have an impact even by shifting just a few percent of voters.
I think investigative journalism may be particularly effective during more "hingey" moments, before a political party has rallied around a certain candidate and before tribalism kicks in.
I agree that, say, changing the underlying structure of the information ecosystem is probably more effective, e.g., changing social media recommendation algorithms (on YouTube, Facebook, or Twitter) to promote truth-seeking and improve the quality of general discourse. (But that's probably almost impossible for any given individual to pull off unless you happen to be, say, Zuckerberg or Musk.) Also, interventions in the AI for epistemics space may be much more scalable and high-leverage.
from Their wedding page 2023,
Not sure if I'm misunderstanding something, but the wedding page seems to be from 2017? (It reads "October 21, 2017" at the top.)
Very much agree.
Also, some of the more neglected topics tend to be more intellectually interesting, and especially appealing if you have a bit of a contrarian temperament. One can make the mistake of essentially going all out on neglectedness and mostly working on the most fringe and galaxy-brained topics imaginable.
I've been there myself: I think I probably spent too much time thinking about lab universes, descriptive population ethics, etc.
Perhaps it connects to a deeper "silver bullet worldview bias": I've been too attracted to worldviews according to which I can have lots of impact. Very understandable given how much meaning and self-worth I derive from how much good I believe I do.
The real world is rather messy and crowded, so elegant and neglected ideas for having impact can become incredibly appealing, promising both outsized impact and intellectual satisfaction.
Thanks for writing this! I also voluntarily reduced my salary for several years (and lived partly off my savings) and had been meaning to write about this for some time but never got around to it. It's always been somewhat puzzling why this isn't more common. While it probably shouldn't become a norm for the reasons you outline, my sense is that more EAs should consider this option (though I may be underestimating how common it is already).
I agree with all the downsides you list, but I could imagine there are also other upsides to voluntary salary reduction. For example, it can signal your commitment both to your organization and to taking altruistic ideas seriously—following the logic where it leads, even when that means doing unconventional things. This might inspire others.
I also worry that we might be biased to overestimate the downsides of voluntary salary reductions: Donating creates tangible satisfaction—the concrete act of giving, the tax receipt, the social recognition, etc. Taking a lower salary offers none of these psychological benefits and can even feel like a loss in status and recognition.
Thanks!
I haven't engaged much with the psychodynamic literature, or only indirectly (some therapy modalities like CFT or ST are quite eclectic and thus reference various psychodynamic concepts), but perhaps @Clare_Diane has. Is there any specific construct, paper/book, or test that you have in mind here?
I'm not familiar with the SWAP but it looks very interesting (though Clare may know it), thanks for mentioning it! As you most likely know, there even exists a National Security Edition developed in collaboration with the US government.
I just realized that in this (old) 80k podcast episode[1], Holden makes similar points and argues that aligned AI could be bad.
My sense is that Holden alludes to both malevolence ("really bad values, [...] we shouldn't assume that person is going to end up being nice") and ideological fanaticism ("create minds that [...] stick to those beliefs and try to shape the world around those beliefs", [...] "This is the religion I follow. This is what I believe in. [...] And I am creating an AI to help me promote that religion, not to help me question it or revise it or make it better.").
Longer quotes below (emphasis added):
Holden: “The other part — if we do align the AI, we’re fine — I disagree with much more strongly. [...] if you just assume that you have a world of very capable AIs, that are doing exactly what humans want them to do, that’s very scary. [...]
Certainly, there’s the fact that because of the speed at which things move, you could end up with whoever kind of leads the way on AI, or is least cautious, having a lot of power — and that could be someone really bad. And I don’t think we should assume that just because that if you had some head of state that has really bad values, I don’t think we should assume that that person is going to end up being nice after they become wealthy, or powerful, or transhuman, or mind uploaded, or whatever — I don’t think there’s really any reason to think we should assume that.
And then I think there’s just a bunch of other things that, if things are moving fast, we could end up in a really bad state. Like, are we going to come up with decent frameworks for making sure that the digital minds are not mistreated? Are we going to come up with decent frameworks for how to ensure that as we get the ability to create whatever minds we want, we’re using that to create minds that help us seek the truth, instead of create minds that have whatever beliefs we want them to have, stick to those beliefs and try to shape the world around those beliefs? I think Carl Shulman put it as, “Are we going to have AI that makes us wiser or more powerfully insane?”
[...] I think even if we threw out the misalignment problem, we’d have a lot of work to do — and I think a lot of these issues are actually not getting enough attention.”
Rob Wiblin: Yeah. I think something that might be going on there is a bit of equivocation in the word “alignment.” You can imagine some people might mean by “creating an aligned AI,” it’s like an AI that goes and does what you tell it to — like a good employee or something. Whereas other people mean that it’s following the correct ideal values and behaviours, and is going to work to generate the best outcome. And these are really quite separate things, very far apart.
Holden Karnofsky: Yeah. Well, the second one, I just don’t even know if that’s a thing. I don’t even really know what it’s supposed to do. I mean, there’s something a little bit in between, which is like, you can have an AI that you ask it to do something, and it does what you would have told it to do if you had been more informed, and if you knew everything it knows. That’s the central idea of alignment that I tend to think of, but I think that still has all the problems I’m talking about. Just some humans seriously do intend to do things that are really nasty, and seriously do not intend — in any way, even if they knew more — to make the world as nice as we would like it to be.
And some humans really do intend and really do mean and really will want to say, you know, “Right now, I have these values” — let’s say, “This is the religion I follow. This is what I believe in. This is what I care about. And I am creating an AI to help me promote that religion, not to help me question it or revise it or make it better.” So yeah, I think that middle one does not make it safe. There might be some extreme versions, like, an AI that just figures out what’s objectively best for the world and does that or something. I’m just like, I don’t know why we would think that would even be a thing to aim for. That’s not the alignment problem that I’m interested in having solved.
I'm one of those bad EAs who don't listen to all 80k episodes as soon as they come out.
Thanks Mike. I agree that the alliance is fortunately rather loose in the sense that most of these countries share no ideology. (In fact, some of them should arguably be ideological enemies, e.g., Islamic theocrats in Iran and Maoist communists in China).
But I worry that this alliance is held together by hatred of (or ressentiment toward) Western secular democratic principles, for ideological and (geo-)political reasons. Hatred can be an extremely powerful and unifying force. (Many political/ideological movements are arguably primarily defined, united, and motivated by what they hate: Nazism by the hatred of Jews, communism by the hatred of capitalists, racists by the hatred of other ethnicities, Democrats by the hatred of Trump and racists, Republicans by the hatred of the woke and communists, etc.)
So I worry that as long as Western democracies continue to influence international affairs, this alliance will continue to exist. And I certainly hope that Western democracies will remain powerful, and I worry that the world (and the future) will become a worse place if they don't.
Thanks for the comment, Michael.
On wild animal suffering
You raise a good point, and I do think WAS persisting into the long-term future is a serious concern. That said, I think the distinction between incidental and intentional suffering is absolutely crucial from a longtermist perspective.
Agents who value ecosystems or nature aesthetically don't have "create suffering" as a terminal value. The suffering is a byproduct—one they might be open to eliminating if they could do so without destroying what they actually care about. That makes the situation amenable to Pareto improvements: keep the ecology, remove the suffering. It's at least conceivable that those who value ecosystems would be open to interventions that reduce suffering in nature—though they'd probably dislike doing so via advanced technology like nanobots. (They might be more open to "natural" interventions, but more on that in a moment.)
It's also worth noting that WAS at its current Earthly scale isn't an s-risk (by definition, s-risks entail vastly more suffering than currently exists on Earth). For it to become one, you'd need agents who actively spread it to other star systems, insist that all the animals keep suffering, and refuse any intervention. At that point, you're arguably describing something that could be called "ecological fanaticism": dogmatic certainty, a simplistic nature-good/intervention-evil dichotomy, and a willingness to perpetuate vast suffering in service of that ideology. Admittedly, this is a bit of a definitional stretch, but it's at least in the neighborhood.
As an aside, I think it's worth noting that a lot of people already care about reducing wild animal suffering in certain ways. Videos of people rescuing wild animals—dogs from drowning, deer stuck on ice—get millions of views and enthusiastic responses. There seems to be broad latent demand for reducing animal suffering when it's made salient. The vast majority of wild animal suffering persists not because people terminally value it, but because we lack the resources and technology to do much about it right now. That will change with ASI.
What's more, fanatics will resist compromise and moral trade. Someone who likes nature and has a vague preference to keep it untouched, but isn't fanatically locked into this, would presumably allow you to eliminate the suffering if you offered enough resources in return (and did it in a way that doesn't offend their sensibilities—superintelligent agents might come up with ways of doing that). It's plausible that altruistic agents will own at least some non-trivial fraction of the cosmic endowment and would be happy to spend it on exactly such trades. Fanatical agents, by contrast, won't trade or compromise.
Where I think the concern about fanaticism becomes most acute is with agents who believe that deliberately creating suffering is morally desirable—e.g., extreme retributivist attitudes that favor inflicting eternal torment. If people with such values have access to ASI, the resulting suffering could dwarf WAS by orders of magnitude, especially factoring in intensity. That's the type of scenario we're trying to draw attention to.
On the atrocity table and intentional deaths
I also received a somewhat similar concern via DM: that filtering for intentional deaths and then finding fanaticism is circular reasoning. I don't think it is, because intentional ≠ ideologically fanatical. You can have intentional mass killing driven by strategic interest, resource extraction, personal megalomania, etc. (And the table does indeed include two non-fanatical examples.) The finding is that among the worst intentional mass killings, most involved ideological fanaticism. This is a substantive empirical result, not a tautology.
Including famines wouldn't even change the picture that much. You'd add the British colonial famines, the Chinese famine of 1907, and Mao's Great Leap Forward (though the Great Leap Forward was itself clearly driven by fanatical ideological zealotry, and certain ideologies—colonialism, laissez-faire economics, etc.—probably also contributed substantially to the British famines in India).
More importantly, once you start including famines, why not also include pandemics? And once you include pandemics, why not deaths from disease more generally—cancer, heart disease, etc.? And why not include deaths from aging then? Obviously, the vast majority of deaths since 1800 were not due to fanaticism; most were from hunger, disease, and aging.
But with sufficiently advanced technology, you won't have deaths from disease, hunger, or aging. These deaths don't reveal anything about terminal preferences; intentional deaths do. That's why, from a longtermist perspective, focusing on intentional deaths isn't cherry-picking: it's studying the thing that actually matters for predicting what the long-term future looks like.
ASI will give agents enormous control over the universe, so the future will be shaped primarily by the terminal values of whoever controls that technology. Unintentional mass death from incompetence or nature (like aging) is terrible, but solvable.
Last, I worry that we're getting too hung up on the atrocity table. Even in a world where ideological fanaticism had resulted in only a few historical atrocities, I'd still be concerned about it as a long-term risk. The table is just one outside-view / historical argument among several for why we should take fanaticism seriously. The core reasons for worrying about fanaticism are mostly discussed in these sections.