Luke Kemp and I just published a paper which criticises existential risk for lacking a rigorous and safe methodology:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3995225
It could be a promising sign for epistemic health that the critiques of leading voices come from early career researchers within the community. Unfortunately, the creation of this paper has not signalled epistemic health. It has been the most emotionally draining paper we have ever written.
We lost sleep, time, friends, collaborators, and mentors because we disagreed on: whether this work should be published, whether potential EA funders would decide against funding us and the institutions we're affiliated with, and whether the authors whose work we critique would be upset.
We believe that critique is vital to academic progress. Academics should never have to worry about future career prospects just because they might disagree with funders. We take the prominent authors whose work we discuss here to be adults interested in truth and positive impact. Those who believe that this paper is meant as an attack against those scholars have fundamentally misunderstood what this paper is about and what is at stake. The responsibility of finding the right approach to existential risk is overwhelming. This is not a game. Fucking it up could end really badly.
What you see here is version 28. We have had approximately 20 + reviewers, around half of which we sought out as scholars who would be sceptical of our arguments. We believe it is time to accept that many people will disagree with several points we make, regardless of how these are phrased or nuanced. We hope you will voice your disagreement based on the arguments, not the perceived tone of this paper.
We always saw this paper as a reference point and platform to encourage greater diversity, debate, and innovation. However, the burden of proof placed on our claims was unbelievably high in comparison to papers which were considered less “political” or simply closer to orthodox views. Making the case for democracy was heavily contested, despite reams of supporting empirical and theoretical evidence. In contrast, the idea of differential technological development, or the NTI framework, have been wholesale adopted despite almost no underpinning peer-review research. I wonder how much of the ideas we critique here would have seen the light of day, if the same suspicious scrutiny was applied to more orthodox views and their authors.
We wrote this critique to help progress the field. We do not hate longtermism, utilitarianism or transhumanism,. In fact, we personally agree with some facets of each. But our personal views should barely matter. We ask of you what we have assumed to be true for all the authors that we cite in this paper: that the author is not equivalent to the arguments they present, that arguments will change, and that it doesn’t matter who said it, but instead that it was said.
The EA community prides itself on being able to invite and process criticism. However, warm welcome of criticism was certainly not our experience in writing this paper.
Many EAs we showed this paper to exemplified the ideal. They assessed the paper’s merits on the basis of its arguments rather than group membership, engaged in dialogue, disagreed respectfully, and improved our arguments with care and attention. We thank them for their support and meeting the challenge of reasoning in the midst of emotional discomfort. By others we were accused of lacking academic rigour and harbouring bad intentions.
We were told by some that our critique is invalid because the community is already very cognitively diverse and in fact welcomes criticism. They also told us that there is no TUA, and if the approach does exist then it certainly isn’t dominant. It was these same people that then tried to prevent this paper from being published. They did so largely out of fear that publishing might offend key funders who are aligned with the TUA.
These individuals—often senior scholars within the field—told us in private that they were concerned that any critique of central figures in EA would result in an inability to secure funding from EA sources, such as OpenPhilanthropy. We don't know if these concerns are warranted. Nonetheless, any field that operates under such a chilling effect is neither free nor fair. Having a handful of wealthy donors and their advisors dictate the evolution of an entire field is bad epistemics at best and corruption at worst.
The greatest predictor of how negatively a reviewer would react to the paper was their personal identification with EA. Writing a critical piece should not incur negative consequences on one’s career options, personal life, and social connections in a community that is supposedly great at inviting and accepting criticism.
Many EAs have privately thanked us for "standing in the firing line" because they found the paper valuable to read but would not dare to write it. Some tell us they have independently thought of and agreed with our arguments but would like us not to repeat their name in connection with them. This is not a good sign for any community, never mind one with such a focus on epistemics. If you believe EA is epistemically healthy, you must ask yourself why your fellow members are unwilling to express criticism publicly. We too considered publishing this anonymously. Ultimately, we decided to support a vision of a curious community in which authors should not have to fear their name being associated with a piece that disagrees with current orthodoxy. It is a risk worth taking for all of us.
The state of EA is what it is due to structural reasons and norms (see this article). Design choices have made it so, and they can be reversed and amended. EA fails not because the individuals in it are not well intentioned, good intentions just only get you so far.
EA needs to diversify funding sources by breaking up big funding bodies and by reducing each orgs’ reliance on EA funding and tech billionaire funding, it needs to produce academically credible work, set up whistle-blower protection, actively fund critical work, allow for bottom-up control over how funding is distributed, diversify academic fields represented in EA, make the leaders' forum and funding decisions transparent, stop glorifying individual thought-leaders, stop classifying everything as info hazards…amongst other structural changes. I now believe EA needs to make such structural adjustments in order to stay on the right side of history.
Hello Luke, thanks for this, which was illuminating. I'll make an initial clarifying comment and then go on to the substantive issues of disagreement.
I'm not sure what you mean here. Are you saying GiveWell didn't repeatedly ignore the work? That Open Phil didn't? Something else? As I set out in another comment, my experience with GiveWell staff was of being ignored by people who weren't at that familiar with the relevant literature - FWIW, I don't recall the concerns you raise in your notes being raised with me. I've not had interactions with Open Phil staff prior to 2021 - for those reading, Luke and I have never spoken - so I'm not able to comment regarding that.
Onto the substantive issues. Would you be prepared to more precisely state what your concerns are, and what sort of evidence would chance your mind? Reading your comments and your notes, I'm not sure exactly what your objections are and, in so far as I do, they don't seem like strong objections.
You mention "weakly validated measures" as an issue but in the text you say "for some scales, reliability and validity have been firmly established", which implies to me you think (some) scales are validated. So which scales are you worried about, to what extent, and why? Are they so non-validated we should think they contain no information? If some scales are validated, why not just use those ones? By analogy, we wouldn't give up on measuring temperature if we thought only some of our thermometers were broken. I'm not sure if we're even on the same page about what it is to 'validate' a measure of something (I can elaborate, if helpful).
On "unconvincing intervention studies", I take it you're referring to your conversation notes with Sonja Lyubormirsky. The 'happiness interventions' you talk about are really just those from the field of 'positive psychology' where, basically, you take mentally healthy people and try to get them to change their thoughts and behaviours to be happier, such as by writing down what they're grateful for. This implies a very narrow interpretation of 'happiness interventions'. Reducing poverty or curing diseases are 'happiness interventions' in my book because they increase happiness, but they are certainly not positive psychology interventions. One can coherently think that subjective wellbeing measures, eg self-reported happiness, are valid and capture something morally important but deny gratitude journalling etc. are particularly promising ways, in practice, of increasing it. Also, there's a big difference between the lab-style experiments psychologists run and the work economists tend to do looking at large panel and cohort data sets.
Regarding "one entire literature using the wrong statistical test for decades", again, I'm not sure exactly what you mean. Is the point about 'item response theory'? I confess that's not something that gets discussed in the academic world of subjective wellbeing measurement - I don't think I've ever heard it mentioned. After a quick look, it seems to be a method to relate scores of psychometric tests to real-world performance. That seems to be a separate methodological ballgame from concerns about the relationship between how people feel and how they report those feelings on a numerical scale, e.g. when we ask "how happy are you, 0-10?". Subjective wellbeing researchers do talk about the issue of 'scale cardinality', ie, roughly, does your "7/10" feel the same to you as my "7/10" does to me? This issue has been starting to get quite a bit of attention in just the last couple of years but has, I concede, been rather neglected by the field. I've got a working paper on this under review which is (I think) the first comprehensive review of the problem.
To me, it looks like in your initial investigation you had the bad luck to run into a couple of dead ends and, quite understandably given those, didn't go further. But I hope you'll let me try to explain further to you why I think happiness research (like happiness itself) is worth taking seriously!