
David_Moss

Principal Research Director @ Rethink Priorities
10135 karma · Joined · Working (6-15 years)

Bio

I am the Principal Research Director at Rethink Priorities. I lead our Surveys and Data Analysis department and our Worldview Investigation Team. 

The Worldview Investigation Team previously completed the Moral Weight Project and the CURVE Sequence / Cross-Cause Model. We're currently working on tools to help EAs decide how they should allocate resources across portfolios of different causes, and on how to use a moral parliament approach to allocate resources under metanormative uncertainty.

The Surveys and Data Analysis Team primarily works on private commissions for core EA movement and longtermist orgs, where we provide:

  • Private polling to assess public attitudes
  • Message testing / framing experiments, testing online ads
  • Expert surveys
  • Private data analyses and survey / analysis consultation
  • Impact assessments of orgs/programs

I previously managed our Wild Animal Welfare department, worked for Charity Science, and served as a trustee at Charity Entrepreneurship and EA London.

My academic interests are in moral psychology and methodology at the intersection of psychology and philosophy.

How I can help others

Survey methodology and data analysis.

Sequences (4)

EA Survey 2024
RP US Public AI Attitudes Surveys
EA Survey 2022
EA Survey 2020

Comments (654)

Thanks gergo!
 

A small number of small orgs do meet the bar, though! AIS Cape Town, EA Philippines, and EA & AIS Hungary (and probably at least some others) have been funded consistently for years. The bar is really high for these groups, though, and I guess funders don't see enough good opportunities to support to justify investing more resources into them through external services. (Maybe this is what you meant anyway, but I wasn't sure.)

My claim is actually slightly different (though much closer to your second claim than your first). It's definitely not that no small groups are funded (obviously untrue), but that funders are often not interested in funding work on the strength of it supporting smaller groups, where "smaller" includes the majority of orgs.

The highest counterfactual impact comes from working with organisations that could benefit but haven’t budgeted for marketing due to a lack of understanding.

As JS from Good Impressions told us:
“Willingness to pay is not as strong a predictor of commitment or perceived value as I thought it would be.”

This creates a chicken-and-egg problem: funders expect clients to pay, but clients lack the means. 

 

This matches our own experience (with the Rethink Priorities Surveys and Data Analysis team).

I would add that, in our experience, the situation is worse than the chicken-and-egg problem as stated. As you note, funders are often not interested in funding work which is supporting smaller, more peripheral or less established groups (and to be clear, this seems to be a matter of 'most orgs don't meet the bar' rather than 'all but a few smaller groups do meet the bar'). 

But we have also been told by more than one funder that if our work is supporting core, well-resourced orgs, then those orgs ought to fund it themselves, and we shouldn't need central funding.[1] This creates a real catch-22, where projects of this kind can be funded neither if they are supporting the biggest orgs nor if they're not.

I also find that people often significantly overestimate the ability of even the largest orgs to pay. We often find that orgs are willing to invest tens of staff hours in working with us on a project (implying they value it), but they still face hard limits on whether they can spend $500-1000 on costs for the project.[2]

  1. ^

    I've not directly experienced this response recently, as we've not been applying for funding on this sort of basis, so YMMV.

  2. ^

    Perhaps explained by (i) even well-resourced orgs not having large amounts of unrestricted funds that they can spend on unforeseen expenses, (ii) internal approvals for funding being difficult, and (iii) needing/wanting to stick to some, pretty low, sense of what reasonable costs for advertising/experiments are.

I am curious about the community's thoughts on this lack of diversity

 

In previous surveys, diversity/JEID-related issues have often been mentioned as reasons for dissatisfaction. That said, there are diverse views about the topic (many of which you can read here; it's much discussed).

Community Health Supplementary Survey 


EA Survey 2019: Community Information

Thanks Jakob!

One thing I'll add to this, which I think is important, is that it may matter significantly how people are engaged in prioritisation within causes. I think it may be surprisingly common for within-cause prioritisation, even at the relatively high sub-cause level, not to help us form a cross-cause prioritisation to a significant extent.

To take your earlier example: suppose you have within-animal prioritisers prioritising farmed animal welfare vs wild animal welfare. They go back and forth on whether it's more important that WAW is bigger in scale, or that FAW is more tractable, and so on. To what extent does that allow us to prioritise wild animal welfare vs biosecurity, which the GCR prioritisers have been comparing to AI Safety? I would suggest, potentially not very much. 

It might seem like work that prioritises FAW vs WAW (within animals) and AI Safety vs biosecurity (within GCR), would allow us to compare any of the considered sub-causes to each other. If these 'sub-cause' prioritisation efforts gave us cost-effectiveness estimates in the same currency then they might, in principle. But I think that very often:

  • Such prioritisation efforts don't give us cost-effectiveness estimates at all (e.g. they just evaluate cruxes relevant to making relative within-cause comparisons).
  • Even if they did, the cost-effectiveness would not be comparable across causes without much additional cross-cause work on moral weights and so on.
  • There may be additional incommensurable epistemic differences between the prioritisations conducted within the different causes that mean we can't combine their prioritisations (e.g. GHD favours more well-evidenced, less speculative things and prioritises A>B, GCR favours higher EV, more speculative things and prioritises C>D).

Someone doing within-cause prioritisation could complain that most of the prioritisation they do is not like this, that it is more intervention-focused and not high-level sub-cause focused, and that it does give cost-effectiveness estimates. I agree that within-cause prioritisation that gives intervention-level cost-effectiveness estimates is potentially more useful for building up cross-cause prioritisations. But even these cases will typically still be limited by the second and third bullet points above (I think the Cross-Cause Model is a rare example of the kind of work needed to generate actual cross-cause prioritisations from the ground up, based on interventions).
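To make the 'same currency' point concrete, here's a toy sketch. All of the numbers (the per-dollar figures and the moral weight) are invented purely for illustration; they are not estimates from the Moral Weight Project or anywhere else.

```python
# Toy illustration only: every number here is hypothetical.

# Within-cause cost-effectiveness estimates, each in its own "currency":
faw_per_dollar = 5.0   # chicken welfare-units improved per $ (farmed animal work)
ghd_per_dollar = 0.01  # human DALYs averted per $ (global health work)

# These cannot be ranked directly: 5 chicken welfare-units vs 0.01 DALYs
# is apples-to-oranges without a conversion factor.

# A (hypothetical) moral weight: how many human-equivalent units one
# chicken welfare-unit is worth. This is exactly the cross-cause input
# that within-cause prioritisation doesn't supply.
chicken_moral_weight = 0.002

# Convert to a common unit (human-equivalents per $):
faw_human_equiv = faw_per_dollar * chicken_moral_weight

# Only now is a cross-cause comparison meaningful. On these made-up
# numbers the two come out roughly equal, and the ranking would flip
# entirely under a different (equally contestable) moral weight.
print(faw_human_equiv, ghd_per_dollar)
```

The point of the sketch is that the ranking is undefined until the moral weight is chosen, which is cross-cause work that neither within-cause effort performs.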

Thanks Vasco.

I think there are two distinct questions here: the sample and the analysis.

When resources allow, I think it is often better to draw a representative sample of the full population, and then analyse the effect of different traits / the effects within different sub-populations. Even if the people getting involved in EA have tended to be younger, more educated etc., I think there are still reasons to be concerned about the views of the broader population. Of course, if you are interested in a very niche population then this approach will either not be possible or will be very resource-inefficient, and you might have to sample more narrowly.

When looking for any interactions with demographics for these results, I found no significant demographic interactions, which is not uncommon. I focus here on the replication study in order to have a cleaner manipulation of the "doing good better" effect, though it's a smaller sample, and we gathered fewer demographics than in the representative sample in the first study. 

For age, we see a fairly consistent effect (the trends are fairly linear even if I allow them to be non-linear, fwiw).

For student status, we see no interaction effect and the same main effect.

For sex, we see no effect and a consistent main effect.[1] 

If there's a lot of interest in this, we could potentially look at education and income in the first survey.

  1. ^

    People sometimes ask why we use sex rather than gender in public surveys and it's usually to match the census so that we can weight.
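For what it's worth, the census-matching step mentioned here is, in its simplest post-stratification form, just population share divided by sample share. A minimal sketch with invented sample and census figures (not real census data):

```python
from collections import Counter

# Hypothetical sample and made-up target proportions by sex.
sample = ["female"] * 70 + ["male"] * 30       # sample is 70/30
census_share = {"female": 0.51, "male": 0.49}  # target population is 51/49

n = len(sample)
sample_share = {k: v / n for k, v in Counter(sample).items()}

# Post-stratification weight per group: population share / sample share.
# Over-represented groups get weight < 1, under-represented groups > 1.
weights = {k: census_share[k] / sample_share[k] for k in census_share}

# Each respondent carries their group's weight; weighted estimates
# (e.g. a weighted mean of responses) then use these values.
w = [weights[s] for s in sample]
print(weights)
```

In practice one would typically weight on several census variables at once (e.g. via raking), but the per-group logic is the same.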

there’s a limit to how much you can learn in a structured interview, because you can’t adapt your questioning on the fly if you notice some particular strength or weakness of a candidate. 

 

I agree. Very often I think that semi-structured interviews (which have a more or less closely planned structure, with the capacity to deviate) will be the best compromise between fully structured and fully unstructured interviews. I think it's relatively rare that the benefits of being completely structured outweigh the benefits of at least potentially asking a relevant follow-up question, and rare that the benefits of being completely unstructured outweigh the benefits of having at least a fairly well-developed plan, with key questions to ask, going in.

Thanks Jakob!

For example, there is a lot of discussion on whether we should focus on far future people or present people. This seems to be an instance of CP, but still you could say that the two causes are within the super-ordinate cause of "Human Welfare".

This is a great example! I think there is a real tension here. 

On the one hand, typically we would say that someone comparing near-term human work vs long-term human work (as broad areas) is engaged in cause prioritisation. And the character of the assessment of areas this broad will likely be very much like the character we ascribed to cause prioritisation (i.e. concerning abstract assessment of general characteristics of very broad areas). On the other hand, if we're classifying the allocation of the movement as a whole across different types of prioritisation, it's clear that prioritisation that was only focused on near-term vs long-term human comparisons would be lacking something important in terms of actually trying to identify the best cause (across cause areas). To give a different example, if the movement only compared invertebrate non-humans vs vertebrate non-humans, I think it's clear that we'd have essentially given up on cause prioritisation, in an important sense.[1]

I think what I would say here is something like: the term "cause (prioritisation)" is typically associated with multiple different features, which typically go together, but which in edge cases can come apart. And in those cases, it's non-obvious how we should best describe the case, and there are probably multiple equally reasonable terminological descriptions. In our system, using just the main top-level EA cause areas, classification may be relatively straightforward, but if you divide things differently or introduce subordinate or superordinate causes, then you need to introduce some more complex distinctions like sub-cause-level within-cause prioritisation.

That aside, I think even if you descriptively divide up the field somewhat differently, the same normative points about the relative strengths and weaknesses of prioritisation focused on larger or smaller objects of analysis (more cause-like vs more intervention-like) and narrower or wider in scope (within a single area vs across more or all areas) can still be applied in the same way. And, descriptively, it still seems like the movement has relatively little prioritisation that is more broadly cross-area.

 

  1. ^

    One thing this suggests is that you might think of this slightly differently when you are asking "What is this activity like?" at the individual level vs asking "What prioritisation are we doing?" at the movement level. A more narrowly focused individual project might be a contribution to wider cause prioritisation. But if, ultimately, no-one is considering anything outside of a single cause area, then we as a movement are not doing any broader cause prioritisation.

Thanks David.

I certainly agree that we should be careful to make sure that we don't over-optimise short-term appeal at the cost of other things that matter (e.g. long-term engagement, accuracy and fidelity of the message, etc.). I don't think we're calling for people to only consider this dimension: we explicitly say that we "think that people should assess particular cases on the basis of all the details relevant to the particular case in question."

That said, I think that there are many cases where those other dimensions won't, in fact, be diminished by selecting messages which are more appealing.[1] For example:

  • In some cases, like this one, we're selecting within taglines that had already been selected as suitable candidates based on other factors (such as accuracy). We're then additionally considering data on how people actually respond.
  • Relatedly, we might have messages available which seem equally good on the other key dimensions which we care about, but which we know to have higher appeal. For example, in this case, I think "the most good you can do" is at least as accurate and likely to encourage long-term engagement as "doing good better". So, if this phrasing performs better in terms of initially appealing to people, this is a pro tanto consideration in its favour.
  • In many contexts, we are only considering the question of which ~5 word tagline to include on a website. So the common tradeoff between shorter, more initially appealing messages and longer but higher-fidelity ones may not apply.
  • In some cases, like short website taglines, most of the effect of the messages may be whether the person continues reading at all (in which case they read the rest of the website and learn a lot more content) or whether they are instantly turned off. We might not expect the short taglines to have a long-term effect on people's understanding of EA (dominating all the later content they read) themselves.
  • While initial appeal and long-term engagement could diverge, in many cases, there's no particular reason to think that they do. This framing suggests a tradeoff between what is merely superficially appealing vs what promotes long-term engagement. But, often, one message might just appeal less to people simpliciter, e.g. because people find it confusing or off-putting, without promoting any long-term benefits.
  • More generally, I think that we can often think of plausible ways that a message might appear more appealing, but actually be sub-optimal, when taking into account long-term second order effects or divergent effects across different subgroups and so on. But in such cases I think we typically need more investigation of how people respond, not less.

All that said, I certainly think that we should be careful not to over-optimise any single dimension, but instead carefully weigh all the relevant factors.

 

  1. ^

    Though note that these are not arguments that we should assume that other considerations don't matter. We should still assess each case on its merits and weigh all the considerations directly.

Thanks for your comment Jakob! A few thoughts:

  • I think if we individuated "causes" in a more fine-grained way, e.g. "Animal Welfare" -> "Plant-based meat alternatives", "Corporate Campaigns" etc., this might not actually change our analysis that much. Why? Prima facie, there are some more people who are working on questions like PBMA vs corporate campaigns, who would otherwise be counted as within-cause prioritisation in our current framework. But, crucially, these researchers are still only making prioritisations within the super-ordinate Animal Welfare cause. They're not comparing e.g. PBMA to nuclear security initiatives. So I think you would need to say something like: these people are engaged in cause-level but within-cause prioritisation. This is technically a kind of (sub-)cause-level prioritisation, but it lacks the cross-cause comparison that our CP and CCP have, due to still being constrained within a single cause.
  • The other thing that I'd note is that we also draw attention to the characteristic styles, and strengths and weaknesses, of cause prioritisation and intervention-level prioritisation. So, we argue, cause prioritisation is characterised more by abstract consideration of general features of the cause, whereas intervention-level prioritisation can increasingly attend to, more closely evaluate, and potentially empirically study the specific details of the particular intervention in question. For example, it's not possible to do a meaningful cost-effectiveness analysis of 'Animals' writ large,[1] but it is possible to do so for a particular animal intervention. I would speculate that as you individuated causes in an increasingly fine-grained way, their evaluation and prioritisation might become more intervention-like and less cause-like, as the object of evaluation becomes more tightly defined and more empirically tractable. My guess, though, is that a lot of even these more fine-grained sub-causes might still be much more like causes than interventions in our analysis, insofar as they will still contain heterogeneous groups of interventions and so need to be evaluated more in terms of general characteristics of the set.
  • I agree that if you individuated cause areas in an increasingly fine-grained way, so that each "cause" under consideration was an intervention (e.g. malaria nets in Uganda) or even a specific charity, then the cause/intervention distinction would collapse, in practice.
  1. ^

    Although you could do so for the best single intervention within the cause.

An alternative hypothesis is that less time is being devoted to these kinds of questions (see here and here). 

This potentially has somewhat complex effects, i.e. it's not just that you get fewer novel insights with 100 hours spent thinking than 200 hours spent thinking, but that you get more novel insights from 100 hours spent thinking when doing so against a backdrop of lots of other people thinking and generating ideas in an active intellectual culture.

To be clear, I don't think this totally explains the observation. I also think that it's true, to some extent, that the lowest hanging fruit has been picked, and that this kind of volume probably isn't optimising for weird new ideas. 

Perhaps related to the second point, I also think it may be the case that relatively more recent work in this area has been 'paradigmatic' rather than 'pre-paradigmatic' or 'crisis stage', which likely generates fewer exciting new insights.
