November 2022 update: I wrote this post during a difficult period in my life. I still agree with the basic point I was gesturing towards, but regret some of the presentation decisions I made. I may make another attempt in the future.
"A system that ignores feedback has already begun the process of terminal instability."
– John Gall, Systemantics
(My request from last time still stands.)
jimrandomh wrote a great comment in response to my last post:
The core thesis here seems to be:
"I claim that [cluster of organizations] have collectively decided that they do not need to participate in tight feedback loops with reality in order to have a huge, positive impact."
There are different ways of unpacking this, so before I respond I want to disambiguate them. Here are four different unpackings:
- Tight feedback loops are important, [cluster of organizations] could be doing a better job creating them, and this is a priority. (I agree with this. Reality doesn't grade on a curve.)
- Tight feedback loops are important, and [cluster of organizations] is doing a bad job of creating them, relative to organizations in the same reference class. (I disagree with this. If graded on a curve, we're doing pretty well.)
- Tight feedback loops are important, but [cluster of organizations] has concluded in their explicit verbal reasoning that they aren't important. (I am very confident that this is false for at least some of the organizations named, where I have visibility into the thinking of decision makers involved.)
- Tight feedback loops are important, but [cluster of organizations] is implicitly deprioritizing and avoiding them, by ignoring/forgetting discouraging information, and by incentivizing positive narratives over truthful narratives.
(4) is the interesting version of this claim, and I think there's some truth to it. I also think that this problem is much more widespread than just our own community, and fixing it is likely one of the core bottlenecks for civilization as a whole.
I think part of the problem is that people get triggered into defensiveness; when they mentally simulate (or emotionally half-simulate) setting up a feedback mechanism, if that feedback mechanism tells them they're doing the wrong thing, their anticipations put a lot of weight on the possibility that they'll be shamed and punished, and not much weight on the possibility that they'll be able to switch to something else that works better. I think these anticipations are mostly wrong; in my anecdotal observation, the actual reaction organizations get to poor results followed by a pivot is usually at least positive about the pivot, at least from the people who matter. But getting people who've internalized a prediction of doom and shame to surface those models, and do things that would make the outcome legible, is very hard.
...
I replied:
Thank you for this thoughtful reply! I appreciate it, and the disambiguation is helpful. (I would personally like to do as much thinking-in-public about this stuff as seems feasible.)
I mean a combination of (1) and (4). I used to not believe that (4) was a thing, but then I started to notice (usually unconscious) patterns of (4) behavior arising in me, and as I investigated further I kept noticing more & more of it, so now I think it's really a thing (because I don't believe that I'm an outlier in this regard).
...
I agree with jimrandomh that (4) is the most interesting version of this claim. What would it look like if the cluster of EA & Rationality organizations I pointed to last time were implicitly deprioritizing getting feedback from reality?
I don't have a crisp articulation of this yet, so here are some examples that seem to me to gesture in that direction:
- Giving What We Can focusing on the number of pledges signed rather than on the amount of money donated by pledge-signers (or better yet, on the impact those donations have had on projects out in the world).
- Founders Pledge focusing on the amount of money pledged and the amount of money donated, rather than on the impact those donations have had out in the world.
- The Against Malaria Foundation and GiveWell focusing on the number of mosquito nets distributed, rather than on the change in malaria incidence in the regions where they have distributed nets.
- 80,000 Hours tracking the number of advising calls they make and the number of career plan changes they catalyze, rather than the long-run impacts their advisees are having in the world.
- It's interesting to compare how 80,000 Hours and Emergent Ventures assess their impact. Also, 80,000 Hours-style career coaching would plausibly be much more effective if it were coupled with small grants to support advisee exploration (this would be more of an incubator model).
- CFAR not tracking workshop participant outcomes in a standardized way over time.
- The Open Philanthropy Project re-granting funds to community members on the basis of reputation (1, 2), rather than on the basis of a track record of effectively deploying capital or on the basis of having concrete, specific plans.
- We could also consider the difficulties that EA Funds had re: deploying capital a few years ago, though as far as I know that situation has improved somewhat in the last couple of years (thanks in large part to the heroic efforts of a few individuals).
Please don't misunderstand – I'm not suggesting that the people involved in these examples are doing anything wrong. I don't think that they are behaving malevolently. The situation seems to me to be more systemic: capable, well-intentioned people begin participating in an equilibrium wherein the incentives of the system encourage drift away from reality.
There are a lot of feedback loops in the examples I list above... but those loops don't seem to connect back to reality, to the actual situation on the ground. Instead, they seem to spiral upwards – metrics tracking opinions, metrics tracking the decisions & beliefs of other people in the community. Goodhart's Law neatly sums up the problem.
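To make the Goodhart dynamic concrete, here is a toy simulation of my own (all numbers invented, not drawn from any real organization's data): projects are scored on a proxy metric that is only noisily correlated with true impact, and the top scorers get selected. The harder you select on the proxy, the more the winners' proxy scores overstate their true impact.

```python
# Toy Goodhart's Law sketch (illustrative only): selecting on a noisy
# proxy metric inflates the proxy without delivering matching impact.
import random

random.seed(0)

def simulate(n_candidates, n_selected):
    """Score n_candidates on a noisy proxy, pick the top n_selected,
    and return (average proxy score, average true impact) of the picks."""
    projects = []
    for _ in range(n_candidates):
        impact = random.gauss(0, 1)          # true impact (unobserved)
        proxy = impact + random.gauss(0, 1)  # measured metric (noisy)
        projects.append((proxy, impact))
    chosen = sorted(projects, reverse=True)[:n_selected]  # select on proxy
    avg_proxy = sum(p for p, _ in chosen) / n_selected
    avg_impact = sum(i for _, i in chosen) / n_selected
    return avg_proxy, avg_impact

# Stronger selection pressure -> bigger gap between proxy and impact.
for pressure in (10, 100, 1000):
    proxy, impact = simulate(pressure, 5)
    print(f"top 5 of {pressure:4d}: avg proxy {proxy:.2f}, avg true impact {impact:.2f}")
```

The gap between the proxy score and the realized impact of the selected projects grows as the candidate pool (the optimization pressure) grows, even though nobody in the simulation is acting in bad faith.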
Why does this happen? Why do capable, well-intentioned people get sucked into equilibria that are deeply, obviously strange?
Let's revisit this part of jimrandomh's great comment:
I think part of the problem is that people get triggered into defensiveness; when they mentally simulate (or emotionally half-simulate) setting up a feedback mechanism, if that feedback mechanism tells them they're doing the wrong thing, their anticipations put a lot of weight on the possibility that they'll be shamed and punished, and not much weight on the possibility that they'll be able to switch to something else that works better. I think these anticipations are mostly wrong; in my anecdotal observation, the actual reaction organizations get to poor results followed by a pivot is usually at least positive about the pivot, at least from the people who matter. But getting people who've internalized a prediction of doom and shame to surface those models, and do things that would make the outcome legible, is very hard.
I don't have a full articulation yet, but I think this starts to get at it. The strange equilibria fulfill a real emotional need for the people who are attracted to them (see Core Transformation for discussion of one approach towards developing an alternative basis for meeting this need).
And from within an equilibrium like this, pointing out the dynamics by which it maintains homeostasis is often perceived as an attack...
Just a quick comment that I don't think the above is a good characterisation of how 80k assesses its impact. Describing our whole impact evaluation would take a while, but some key elements are:
We think impact is heavy-tailed, so we try to identify the most high-impact 'top plan changes'. We do case studies of what impact they had and how we helped. This often involves interviewing the person, as well as people who can assess their work. (Last year these interviews were done by a third party to reduce desirability bias.) We then do a rough Fermi estimate of the impact.
We also track the number of a wider class of 'criteria-based plan changes', but then take a random sample and make Fermi estimates of impact so we can compare their value to the top plan changes.
If we had to choose a single metric, it would be something closer to impact-adjusted years of extra labour added to top causes, rather than the sheer number of plan changes.
We also look at other indicators like:
There have been other surveys of the highest-impact people who entered EA in recent years, evaluating what fraction came from 80k, which lets us estimate the percentage of the EA workforce that came via 80k.
We look at the EA survey results, which let us track things like how many people are working at EA orgs and entered via 80k.
We use number of calls as a lead metric, not an impact metric. Technically it's the number of calls with people who made an application above a quality bar, rather than the raw number. We've checked and it seems to be a proxy for the number of impact-adjusted plan changes that result from advising.
This is not to deny that assessing our impact is extremely difficult and ultimately involves a lot of judgement calls (we were explicit about that in the last review), but we've put a lot more work into it than the above implies: probably around 5-10% of team time in recent years.
I think similar comments could be made about several of the other examples. E.g. GWWC also tracks dollars donated each year to effective charities (now via the EA Funds) and total dollars pledged. They track the number of pledges as well, since that's a better proxy for the community-building benefits.
I feel like I'm asking about something pretty simple. Here's a sketch:
GiveWell basically does this for its top charities.