I am the Principal Research Director at Rethink Priorities. I lead our Surveys and Data Analysis department and our Worldview Investigation Team.
The Worldview Investigation Team previously completed the Moral Weight Project and the CURVE Sequence / Cross-Cause Model. We're currently working on tools to help EAs decide how they should allocate resources within portfolios of different causes, and on how to use a moral parliament approach to allocate resources given metanormative uncertainty.
The Surveys and Data Analysis Team primarily works on private commissions for core EA movement and longtermist orgs, where we provide:
Survey methodology and data analysis.
Formerly, I also managed our Wild Animal Welfare department, and I've previously worked for Charity Science and been a trustee at Charity Entrepreneurship and EA London.
My academic interests are in moral psychology and methodology at the intersection of psychology and philosophy.
Thanks Vasco.
I think there are two distinct questions here: the sample and the analysis.
When resources allow, I think it is often better to draw a representative sample of the full population, and then analyse the effect of different traits / the effects within different sub-populations. Even if the people getting involved in EA have tended to be younger, more educated, etc., I think there are still reasons to be concerned about the views of the broader population. Of course, if you are interested in a very niche population then this approach will either not be possible or will be very resource-inefficient, and you might have to sample more narrowly.
Looking for demographic interactions in these results, I found none that were significant, which is not uncommon. I focus here on the replication study in order to have a cleaner manipulation of the "doing good better" effect, though it's a smaller sample, and we gathered fewer demographics than in the representative sample in the first study. (A sketch of the kind of interaction check involved follows the list below.)
For age, we see a fairly consistent effect (the trends are fairly linear even if I allow them to be non-linear, fwiw).
For student status, we see no interaction effect and the same main effect.
For sex, we see no effect and a consistent main effect.[1]
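For anyone curious what this kind of check involves, here's a minimal sketch of an interaction test. The column names (`support`, `condition`, `age`) and the data are hypothetical stand-ins, not our actual variables:

```python
# Minimal sketch of testing a demographic interaction with a framing
# manipulation. All names and numbers here are illustrative only.
import pandas as pd
import statsmodels.formula.api as smf

# Toy data standing in for the survey responses.
df = pd.DataFrame({
    "support": [5.2, 4.8, 6.1, 5.5, 4.9, 6.3, 5.0, 5.8],
    "condition": ["control", "treatment"] * 4,
    "age": [22, 35, 41, 29, 53, 47, 31, 60],
})

# Main effects plus the condition x age interaction term.
model = smf.ols("support ~ condition * age", data=df).fit()
print(model.summary())
```

A non-significant coefficient on the `condition:age` term is the pattern reported above: the main effect of the manipulation holds, but doesn't detectably vary with age.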
If there's a lot of interest in this, we could potentially look at education and income in the first survey.
People sometimes ask why we use sex rather than gender in public surveys; it's usually to match the census so that we can weight the sample.
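To illustrate what that weighting involves, here's a minimal sketch of cell-based post-stratification. The census targets are made up for illustration; real targets would come from census tables:

```python
# Minimal sketch of post-stratification weighting: each respondent is
# weighted by (census share of their cell) / (sample share of their
# cell), so weighted totals match the census margins.
import pandas as pd

# Toy sample standing in for the survey respondents.
sample = pd.DataFrame({"sex": ["female", "female", "male", "female"]})

# Assumed population targets (illustrative, not real census figures).
census_share = {"female": 0.51, "male": 0.49}
sample_share = sample["sex"].value_counts(normalize=True)

sample["weight"] = sample["sex"].map(lambda s: census_share[s] / sample_share[s])
print(sample)
```

In practice you would typically weight on several demographics at once (e.g. via raking), but the principle is the same.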
there’s a limit to how much you can learn in a structured interview, because you can’t adapt your questioning on the fly if you notice some particular strength or weakness of a candidate.
I agree. Very often I think that semi-structured interviews (which have a more or less closely planned structure, with the capacity to deviate) will be the best compromise between fully structured and fully unstructured interviews. I think it's relatively rare that the benefits of being completely structured outweigh the benefits of at least potentially asking a relevant follow-up question, and rare that the benefits of being completely unstructured outweigh the benefits of having at least a fairly well-developed plan, with key questions to ask going in.
Thanks Jakob!
For example, there is a lot of discussion on whether we should focus on far future people or present people. This seems to be an instance of CP, but still you could say that the two causes are within the super-ordinate cause of "Human Welfare".
This is a great example! I think there is a real tension here.
On the one hand, typically we would say that someone comparing near-term human work vs long-term human work (as broad areas) is engaged in cause prioritisation. And the character of the assessment of areas this broad will likely be very much like the character we ascribed to cause prioritisation (i.e. concerning abstract assessment of general characteristics of very broad areas). On the other hand, if we're classifying the allocation of the movement as a whole across different types of prioritisation, it's clear that prioritisation that was only focused on near-term vs long-term human comparisons would be lacking something important in terms of actually trying to identify the best cause (across cause areas). To give a different example, if the movement only compared invertebrate non-humans vs vertebrate non-humans, I think it's clear that we'd have essentially given up on cause prioritisation, in an important sense.[1]
I think what I would say here is something like: the term "cause (prioritisation)" is typically associated with multiple different features, which typically go together, but which in edge cases can come apart. And in those cases, it's non-obvious how we should best describe the case, and there are probably multiple equally reasonable terminological descriptions. In our system, using just the main top-level EA cause areas, classification may be relatively straightforward, but if you divide things differently or introduce subordinate or superordinate causes, then you need to introduce some more complex distinctions like sub-cause-level within-cause prioritisation.
That aside, I think even if you descriptively divide up the field somewhat differently, the same normative points about the relative strengths and weaknesses of prioritisation focused on larger or smaller objects of analysis (more cause-like vs more intervention-like) and narrower or wider in scope (within a single area vs across more or all areas) can still be applied in the same way. And, descriptively, it still seems like the movement has relatively little prioritisation that is more broadly cross-area.
One thing this suggests is that you might think of this slightly differently when you are asking "What is this activity like?" at the individual level vs asking "What prioritisation are we doing?" at the movement level. A more narrowly focused individual project might be a contribution to wider cause prioritisation. But if, ultimately, no-one is considering anything outside of a single cause area, then we as a movement are not doing any broader cause prioritisation.
Thanks David.
I certainly agree that we should be careful to make sure that we don't over-optimise short-term appeal at the cost of other things that matter (e.g. long-term engagement, accuracy and fidelity of the message, etc.). I don't think we're calling for people to only consider this dimension: we explicitly say that we "think that people should assess particular cases on the basis of all the details relevant to the particular case in question."
That said, I think that there are many cases where those other dimensions won't, in fact, be diminished by selecting messages which are more appealing.[1] For example:
All that said, I certainly think that we should be careful not to over-optimise any single dimension, but instead carefully weigh all the relevant factors.
Though note that these are not arguments that we should assume that other considerations don't matter. We should still assess each case on its merits and weigh all the considerations directly.
Thanks for your comment Jakob! A few thoughts:
Although you could do so for the best single intervention within the cause.
An alternative hypothesis is that less time is being devoted to these kinds of questions (see here and here).
This potentially has somewhat complex effects: it's not just that you get fewer novel insights with 100 hours of thinking than with 200, but also that you get more novel insights from 100 hours of thinking when doing so against a backdrop of lots of other people thinking and generating ideas in an active intellectual culture.
To be clear, I don't think this totally explains the observation. I also think that it's true, to some extent, that the lowest hanging fruit has been picked, and that this kind of volume probably isn't optimising for weird new ideas.
Perhaps related to the second point, I also think it may be the case that relatively more recent work in this area has been 'paradigmatic' rather than 'pre-paradigmatic' or 'crisis stage', which likely generates fewer exciting new insights.
A striking finding is that the area where people expect the greatest positive impact from AI is biosecurity and pandemic prevention
It seems like this might simply be explained by "biosecurity and pandemic prevention" containing two very different things: 'novel biosecurity risks' (of the kind EAs are concerned about) and 'helping with the next COVID-19' (likely more salient to the general public and potentially involving broader healthcare improvements, which AI was also predicted to improve to a similar extent).
Perhaps relatedly, biosecurity and pandemic prevention was rated lowest as a problem today (below everything other than AI itself).
Post-FTX, I think core EA adopted a “PR mentality” that (i) has been a failure on its own terms and (ii) is corrosive to EA’s soul.
I find it helpful to distinguish two things, one which I think EA is doing too much of and one which EA is doing too little of:
Currently, the online EA ecosystem doesn’t feel like a place full of exciting new ideas, in a way that’s attractive to smart and ambitious people
This may be partly related to the fact that EA is doing relatively little cause and cross-cause prioritisation these days (though, since we posted this, GPI has wound down and Forethought has spun up).
People may still be doing within-cause, intervention-level prioritisation (which is important), but this may be unlikely to generate new, exciting ideas, since it assumes causes and works only within them, is often narrow and technical (e.g. comparing slaughter methods), and is often fundamentally unsystematic or inaccessible (e.g. how do I, a grantmaker, feel about these founders?).
Thanks Jakob!
One thing I'll add to this, which I think is important, is that it may matter significantly how people are engaged in prioritisation within causes. I think it may be surprisingly common for within-cause prioritisation, even at the relatively high sub-cause level, not to help us form a cross-cause prioritisation to a significant extent.
To take your earlier example: suppose you have within-animal prioritisers prioritising farmed animal welfare vs wild animal welfare. They go back and forth on whether it's more important that WAW is bigger in scale, or that FAW is more tractable, and so on. To what extent does that allow us to prioritise wild animal welfare vs biosecurity, which the GCR prioritisers have been comparing to AI Safety? I would suggest, potentially not very much.
It might seem like work that prioritises FAW vs WAW (within animals) and AI Safety vs biosecurity (within GCR), would allow us to compare any of the considered sub-causes to each other. If these 'sub-cause' prioritisation efforts gave us cost-effectiveness estimates in the same currency then they might, in principle. But I think that very often:
Someone doing within-cause prioritisation could complain that most of the prioritisation they do is not like this: that it is more intervention-focused rather than high-level sub-cause focused, and that it does give cost-effectiveness estimates. I agree that within-cause prioritisation that gives intervention-level cost-effectiveness estimates is potentially more useful for building up cross-cause prioritisations. But even these cases will typically still be limited by the second and third bullet points above (I think the Cross-Cause Model is a rare example of the kind of work needed to generate actual cross-cause prioritisations from the ground up, based on interventions).
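To make the 'same currency' point concrete, here's a hypothetical sketch: if each within-cause comparison produced cost-effectiveness estimates in a shared unit (the causes are real, but the unit and numbers below are entirely made up), a cross-cause ranking would fall out directly:

```python
# Entirely hypothetical cost-effectiveness estimates in a shared unit
# (say, welfare-adjusted life-years per $1,000). Illustrative only.
estimates = {
    "farmed animal welfare": 12.0,
    "wild animal welfare": 4.0,
    "AI safety": 9.0,
    "biosecurity": 7.0,
}

# With a common currency, within-cause estimates compose into a
# cross-cause ranking for free.
for cause, value in sorted(estimates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cause}: {value}")
```

The difficulty is precisely that within-cause work typically doesn't produce estimates in such a shared unit, so this composition usually isn't available.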