
David_Moss

Principal Research Director @ Rethink Priorities
8154 karma · Joined · Working (6-15 years)

Bio

I am the Principal Research Director at Rethink Priorities. I lead our Surveys and Data Analysis department and our Worldview Investigation Team. 

The Worldview Investigation Team previously completed the Moral Weight Project and the CURVE Sequence / Cross-Cause Model. We're currently working on tools to help EAs decide how to allocate resources within portfolios of different causes, and on how to use a moral parliament approach to allocate resources given metanormative uncertainty.

The Surveys and Data Analysis Team primarily works on private commissions for core EA movement and longtermist orgs, where we provide:

  • Private polling to assess public attitudes
  • Message testing / framing experiments, testing online ads
  • Expert surveys
  • Private data analyses and survey / analysis consultation
  • Impact assessments of orgs/programs

I formerly managed our Wild Animal Welfare department, and I've previously worked for Charity Science and been a trustee at Charity Entrepreneurship and EA London.

My academic interests are in moral psychology and methodology at the intersection of psychology and philosophy.

How I can help others

Survey methodology and data analysis.

Sequences
3

RP US Public AI Attitudes Surveys
EA Survey 2022
EA Survey 2020

Comments
567

I was indeed trying to say option (a): that there's a "bias towards animals relative to other cause areas". Yes, I agree it would be ideal to have people on different sides of debates in these kinds of teams, but that's often impractical and not my point here.

 

Thanks for clarifying!

  • Re. being biased in favour of animal welfare relative to other causes: I feel at least moderately confident that this is not the case. As the person overseeing the team, I would be very concerned if I thought it were. But it doesn't match my experience of the team, who seem equally happy to work on other cause areas (which is why we spent significant time proposing work across cause areas) and primarily interested in addressing fundamental questions about how we can best allocate resources.[1] 
  • I am much more sympathetic to the second concern I outlined (which you say is not your concern): we might not be biased in favour of one cause area against another, but we still might lack people on both extremes of all key debates. Both of us seem to agree this is probably inevitable (one reason: EA is heavily skewed towards people who endorse certain positions, as we have argued here, which is a reason to be sceptical of our conclusions and probe the implications of different assumptions).[2] 

Some broader points:

  • I think that it's more productive to focus on evaluating our substantive arguments (to see if they are correct or incorrect) than trying to identify markers of potential latent bias.
  • Our resource allocation work is deliberately framed in terms of open frameworks which allow people to explore the implications of their own assumptions.
     
  1. ^

    And if the members of the team wanted to work solely on animal causes (in a different position), I think they'd all be well-placed to do so. 

  2. ^

    That said, I don't think we do too badly here, even in the context of AW specifically: e.g. Bob Fischer has previously published on hierarchicalism, the view that humans matter more than other animals.

One possible way of thinking about this, which might tie your work on smaller battles into a 'big picture', is if you believe that your work on the smaller battles is indirectly helping the wider project: e.g. by working to solve one altruistic cause, you spare other altruistic individuals and resources from being spent on that cause, increasing the resources available for wider altruistic projects, and potentially increasing the altruistic resources available in the future.[1]

Note that I'm only saying this is a possible way of thinking about this, not necessarily that you should think this (for one thing, the extent to which this is true probably varies across areas, depending on how inter-connected different cause areas are and on their varying flowthrough effects).

  1. ^

    As in this passage from one of Yudkowsky's short stories:

    "But time passed," the Confessor said, "time moved forward, and things changed."  The eyes were no longer focused on Akon, looking now at something far away.  "There was an old saying, to the effect that while someone with a single bee sting will pay much for a remedy, to someone with five bee stings, removing just one sting seems less attractive.  That was humanity in the ancient days.  There was so much wrong with the world that the small resources of altruism were splintered among ten thousand urgent charities, and none of it ever seemed to go anywhere.  And yet... and yet..."

    "There was a threshold crossed somewhere," said the Confessor, "without a single apocalypse to mark it.  Fewer wars.  Less starvation.  Better technology.  The economy kept growing.  People had more resource to spare for charity, and the altruists had fewer and fewer causes to choose from.  They came even to me, in my time, and rescued me.  Earth cleaned itself up, and whenever something threatened to go drastically wrong again, the whole attention of the planet turned in that direction and took care of it.  Humanity finally got its act together."

4 out of 5 of the team members worked publicly (googleably), to a greater or lesser extent, on animal welfare issues even before joining RP


I think this risks being misleading, because the team have also worked on many non-animal related topics. And it's not surprising that they have, because AW is one of the key cause areas of EA, just as it's not surprising they've worked on other core EA areas. So pointing out that the team have worked on animal-related topics seems like cherry-picking, when you could equally well point to work in other areas as evidence of bias in those directions.

For example, Derek has worked on animal topics, but also digital consciousness, with philosophy of mind being a unifying theme.

I can give a more detailed response regarding my own work specifically, since I track all my projects directly. In the last 3 years, 112/124 (90.3%)[1] of the projects I've worked on personally have been EA Meta / Longtermist related, with <10% animal related. But I think it would be a mistake to conclude from this that I'm longtermist-biased, even though that constitutes a larger proportion of my work.

Edit: I realise an alternative way to cash out your concern might not be in terms of bias towards animals relative to other cause areas, but rather that we should have people on both sides of all the key cause areas or key debates (e.g. we should have people on both extremes of being pro- and anti-animal, pro- and anti-AI, pro- and anti-GHD, and also presumably on other key questions like suffering focus etc.). 

If so, then I agree this would be desirable as an ideal, but (as you suggest) impractical (and perhaps undesirable) to achieve in a small team.

  1. ^

    This is within RP projects; if we included non-RP academic projects, the proportion of animal projects would be even lower.

Thanks Nick!
 

My reservation is that the research will end up being somewhat biased towards animal welfare, considering that has been a major research focus and passion for most of these researchers for a long time.

This seems overstated to me. 

WIT is a cross-cause team in a cross-cause organization. The Animal Moral Weight project is one project we've worked on, but our other subsequent projects (the Cross-Cause Model, the Portfolio Tool, the Moral Parliament Tool, the Digital Consciousness Model, and our OpenAI project on Risk Alignment in Agentic AI) are not specifically animal related. We've also elsewhere proposed work looking at human moral weight and movement building.

You previously suggested that the team who worked on the Moral Weight project was skewed towards people who would be friendly to animals (though the majority of the staff on the team were animal scientists of one kind or another). But all the researchers on the current WIT team (aside from Bob himself) were hired after the completion of the Moral Weight project. In addition, I personally oversee the team and have worked on a variety of cause areas.

Also, regarding interpreting our resource allocation projects: the key animal-related inputs to these are the moral weight scores, and our tools purposefully give users the option to adjust these themselves, in line with their own views.

EA already skews somewhat young, but from the last EA community survey it looks like the average age was around 29. So I wonder why the vast majority of people doing mentorship/coaching/career advising are younger than that? Maybe the older people involved in EA are disproportionately not employed by EA organizations and are thus less focused on funneling people into impactful careers?

 

I checked and people who currently work in an EA org are only slightly older on average (median 29 vs median 28).

If true, it could mean that any theory framed in opposition, such as a critique of Shorttermism or Longtermism, might be more appealing than the time-focused theory itself. Criticising short-term thinking is an applause light in many circles.

 

I agree this could well be true at the level of arguments, i.e. I think there are probably longtermist (anti-shorttermist) framings which would be successful. But I suspect it would be harder to make this work at the level of framing/branding a whole movement, i.e. I think promoting an 'anti-shorttermist' movement would be hard to do successfully.

It takes a significant amount of time to mark a test task. But this can be fixed by just adjusting the height of the screening bar, as opposed to using credentialist and biased methods (like looking at someone's LinkedIn profile or CV). 

 

Whether or not to use "credentialist and biased methods (like looking at someone's LinkedIn profile or CV)" seems orthogonal to the discussion at hand? 

The key issue seems to be that if you raise the screening bar, then you would be admitting fewer applicants to the task (the opposite of the original intention).

This is an empirical question, and I suspect it is not true. For example, it took me 10 minutes to mark each candidate's 1-hour test task. So my salary would need to be 6x higher (per unit time) than the test task payment for this to be true. 

This will definitely vary by org and by task. But many EA orgs report valuing their staff's time extremely highly. And my impression is that both grading longer tasks and then processing the additional applicants (many orgs will also feel compelled to offer at least some feedback if a candidate has completed a multi-hour task) will often take much longer than 10 minutes total.

Orgs can continue to pay top candidates to complete the test task, if they believe it measurably decreases the attrition rate, but give all candidates that pass an anonymised screening bar the chance to complete a test task.

 

My guess is that, for many orgs, the time cost of assessing the test task is larger than the financial cost of paying candidates to complete the test task, and that significant reasons for wanting to compensate applicants are (i) a sense of justice, (ii) wanting to avoid the appearance of unreasonably demanding lots of unpaid labour from applicants, not just wanting to encourage applicants to complete the tasks[1].

So I agree that there are good reasons for wanting more people to be able to complete test tasks. But I think that doing so would potentially significantly increase costs to orgs, and that not compensating applicants would reduce costs to orgs by less than one might imagine.

I also think the justice implications of compensating applicants are unclear (offering pay for longer tasks may make them more accessible to poorer applicants).

  1. ^

    I think that many applicants are highly motivated to complete tasks, in order to have a chance of getting the job.

I guess it depends on the specifics of the situation, but, to me, the case described, of a board member making one or two incorrect claims (in a comment that presumably also had a bunch of accurate and helpful content) that they needed to walk back sounds… not that bad? Like, it seems only marginally worse than their comment being fully accurate the first time round...

 

I agree that it depends on the situation, but I think this would often be quite a lot worse in real, non-ideal situations. In ideal communicative situations, mistaken information can simply be corrected at minimal cost. But in non-ideal situations, I think one will often see things like:

  • Mistaken information gets shared and people spend time debating or being confused about the false information
  • Many people never notice or forget that the mistaken information got corrected and it keeps getting believed and shared
  • Some people speculate that the mistaken claims weren't innocently shared, but that the board member was being evasive/dishonest
  • People conclude that the organization / board is incompetent and chaotic because they can't even get basic facts right

Fwiw, I think different views about this ideal/non-ideal distinction underlie a lot of disagreements about communicative norms in EA.

Thanks Ben!

I don't think there's a single way to interpret the magnitude of the differences or the absolute scores (e.g. a single effect size), so it's best to examine this in a number of different ways.

One way to interpret the difference between the ratings is to look at the probability of superiority scores. For example, for Study 3 we showed that ~78% of people would be expected to rate AI safety (6.00) higher than longtermism (4.75). In contrast, for AI safety vs effective giving (5.65), it's 61%, and for GCRR (5.95) it's only about 51%.
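
For readers unfamiliar with the measure, here is a minimal illustrative sketch (not RP's actual analysis code) of how a probability-of-superiority score can be computed from two vectors of ratings. The simulated 1-7 ratings, sample size, and use of independent samples are assumptions for the example; with within-subjects data one would instead compare each respondent's two ratings directly.

```python
# Hypothetical example: probability of superiority for two sets of ratings.
# The data below are simulated; the means loosely mirror the figures quoted above.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
ai_safety = np.clip(rng.normal(6.0, 1.2, 500).round(), 1, 7)     # simulated 1-7 ratings
longtermism = np.clip(rng.normal(4.75, 1.5, 500).round(), 1, 7)  # simulated 1-7 ratings

# Probability of superiority = U / (n1 * n2): the probability that a randomly
# chosen rating of one item exceeds a randomly chosen rating of the other
# (ties contribute 0.5 via the midrank-based U statistic).
u, _ = mannwhitneyu(ai_safety, longtermism, alternative="two-sided")
prob_superiority = u / (len(ai_safety) * len(longtermism))
print(f"Probability of superiority: {prob_superiority:.2f}")
```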

You can also examine the (raw and weighted) distributions of the responses. This allows one to assess directly how many people "Like a great deal", "Dislike a great deal" and so on. 

You can also look at different measures, which have a more concrete interpretation than liking. We did this with one (interest in hearing more information about a topic). But in future studies we'll include additional concrete measures, so we know e.g. how many people say they would get involved with x movement.

I agree that comparing these responses to other similar things outside of EA (like "positive action" but on the negative side) would be another useful way to compare the meaning of these responses.

One other thing to add is that the design of these studies isn't optimised for assessing the effect of different names in absolute terms, because every subject evaluated every item ("within-subjects"). This allows greater statistical power more cheaply, but the evaluations are also more likely to be implicitly comparative. To get an estimate of something like the difference in the number of people who would be interested in x rather than y (assuming they would only encounter one or the other in the wild at a single time), we'd want to use a between-subjects design where people only evaluate one item and indicate their interest in it.
