I work with CE/AIM-incubated charity ARMoR on research distillation, quantitative modelling, consulting, MEL, and general org-boosting to support policies that incentivise innovation and ensure access to antibiotics to help combat AMR. I was previously an AIM Research Program fellow, was supported by an FTX Future Fund regrant and later by Open Philanthropy's affected-grantees program, and before that I spent 6 years doing data analytics, business intelligence, and knowledge and project management across various industries (airlines, e-commerce) and departments (commercial, marketing), after majoring in physics at UCLA and changing my mind about becoming a physicist. I've also initiated some local priorities research efforts, e.g. a charity evaluation initiative with the moonshot aim of reorienting my home country Malaysia's giving landscape towards effectiveness, albeit with mixed results.
I first learned about effective altruism circa 2014 via A Modest Proposal, Scott Alexander's polemic on using dead children as units of currency to force readers to grapple with the opportunity costs of subpar resource allocation under triage. I have never stopped thinking about it since, although my relationship to it has changed quite a bit; I related to Tyler's personal story (which unsurprisingly also references A Modest Proposal as a life-changing polemic):
I thought my own story might be more relatable for friends with a history of devotion – unusual people who’ve found themselves dedicating their lives to a particular moral vision, whether it was (or is) Buddhism, Christianity, social justice, or climate activism. When these visions gobble up all other meaning in the life of their devotees, well, that sucks. I go through my own history of devotion to effective altruism. It’s the story of [wanting to help] turning into [needing to help] turning into [living to help] turning into [wanting to die] turning into [wanting to help again, because helping is part of a rich life].
[Graham’s hierarchy of disagreements] is useful for its intended purpose, but it isn’t really a hierarchy of disagreements. It’s a hierarchy of types of response, within a disagreement. Sometimes things are refutations of other people’s points, but the points should never have been made at all, and refuting them doesn’t help. Sometimes it’s unclear how the argument even connects to the sorts of things that in principle could be proven or refuted.
If we were to classify disagreements themselves – talk about what people are doing when they’re even having an argument – I think it would look something like this:
Most people are either meta-debating – debating whether some parties in the debate are violating norms – or they’re just shaming, trying to push one side of the debate outside the bounds of respectability.
If you can get past that level, you end up discussing facts (blue column on the left) and/or philosophizing about how the argument has to fit together before one side is “right” or “wrong” (red column on the right). Either of these can be anywhere from throwing out a one-line claim and adding “Checkmate, atheists” at the end of it, to cooperating with the other person to try to figure out exactly what considerations are relevant and which sources best resolve them.
If you can get past that level, you run into really high-level disagreements about overall moral systems, or which goods are more valuable than others, or what “freedom” means, or stuff like that. These are basically unresolvable with anything less than a lifetime of philosophical work, but they usually allow mutual understanding and respect.
More on the high-level generators of disagreement (emphasis mine, other than 1st sentence):
High-level generators of disagreement are what remains when everyone understands exactly what’s being argued, and agrees on what all the evidence says, but have vague and hard-to-define reasons for disagreeing anyway. In retrospect, these are probably why the disagreement arose in the first place, with a lot of the more specific points being downstream of them and kind of made-up justifications. These are almost impossible to resolve even in principle. ...
Some of these involve what social signal an action might send; for example, even a just war might have the subtle effect of legitimizing war in people’s minds. Others involve cases where we expect our information to be biased or our analysis to be inaccurate; for example, if past regulations that seemed good have gone wrong, we might expect the next one to go wrong even if we can’t think of arguments against it. Others involve differences in very vague and long-term predictions, like whether it’s reasonable to worry about the government descending into tyranny or anarchy. Others involve fundamentally different moral systems, like if it’s okay to kill someone for a greater good. And the most frustrating involve chaotic and uncomputable situations that have to be solved by metis or phronesis or similar-sounding Greek words, where different people’s Greek words give them different opinions.
You can always try debating these points further. But these sorts of high-level generators are usually formed from hundreds of different cases and can’t easily be simplified or disproven. Maybe the best you can do is share the situations that led to you having the generators you do. Sometimes good art can help.
The high-level generators of disagreement can sound a lot like really bad and stupid arguments from previous levels. “We just have fundamentally different values” can sound a lot like “You’re just an evil person”. “I’ve got a heuristic here based on a lot of other cases I’ve seen” can sound a lot like “I prefer anecdotal evidence to facts”. And “I don’t think we can trust explicit reasoning in an area as fraught as this” can sound a lot like “I hate logic and am going to do whatever my biases say”. If there’s a difference, I think it comes from having gone through all the previous steps – having confirmed that the other person knows as much as you, that you might be intellectual equals who are both equally concerned about doing the moral thing – and realizing that both of you alike are controlled by high-level generators. High-level generators aren’t biases in the sense of mistakes. They’re the strategies everyone uses to guide themselves in uncertain situations.
Regarding your "something clearly rational here that's kinda unintuitive to get a grip on", I think of it as epistemic learned helplessness as a "social safety valve" to the downside risk of believing persuasive arguments that can (potentially catastrophically) harm the believer, cf. Reason as memetic immune disorder.
There's a lot more to the study of disagreement if you're keen, shame it's mostly just one person working on it and they're busy writing a book nowadays.
Thanks for the intriguing pushback; part of why I kept bringing this up over the years was to surface this kind of counterargument, upvoted. Flagging for my future self to look into the evidence base behind:
The evidence of "success" he cites only applies to the latter (where "success" is with respect to Brier scores and such), not the former.
because I'd always assumed it was "obviously" the former (wrongly it seems), since the latter seemed non-robust in the sense Dan Luu looked into (cf. "you really have to understand things", which multi-model aggregations are not).
Sporting decision makers made better decisions after they got serious about learning from analytical models of their games, models that often began life as blogosphere passion projects. In this front office, we believe that can happen again, one level up: sports themselves as the model for decision makers trying to improve the outcomes that matter most.
For someone who doesn't care about who wins, what do sports have to offer? High on my list is getting to closely observe people being incredibly (like world-outlier-level) intense about something. I am generally somewhat obsessed with obsession (I think it is a key ingredient in almost every case of someone accomplishing something remarkable). And with sports, you can easily identify which players are in the top-5 in the world at the incredibly competitive things they do; you can safely assume that their level of obsession and competitiveness is beyond what you'll ever be able to wrap your head around; and you can see them in action. ...
What else is good about sports:
I think it's fun when people care so deeply about something so intrinsically meaningless. It means we can enjoy their emotional journeys without all the baggage of whether we're endorsing something "good" or "bad." (My wife also loves this about sports - her thing is watching Last Chance U while crying her eyes out.) My next sports post will be a collection of "heartwarming" links and stories.
There's a lot of sports analysis, and I kind of think sports is to social science what the laboratory is to natural sciences. Sports statistics have high sample sizes, stable environments and are exhaustively captured on video, so it's often possible to actually figure out what's going on. It's therefore unusually easy to form your own judgment about whether someone's analysis is good or bad, and that can have lessons for what patterns to look for on other topics. (My view: academic analysis of sports is often almost unbelievably bad, as you can see from some of the Phil Birnbaum eviscerations, whereas average sportswriting and TV commentating is worse than language can convey. Nerdy but non-academic sports analysis websites like Cleaning the Glass, Football Outsiders and FiveThirtyEight are good.)
the last point of which jibes with your blog's thesis.
Curious to see that poll. I'm in that minority too.
But I do wonder how much the "someone's voice is an extension of them" view is mediated by the privilege of being able to effortlessly articulate one's thoughts in public, especially in a forum that invites scrutiny such as this one, and to reliably get positive engagement. You and Brad seem to be on opposite ends of this spectrum (?). Your combination of prolific output, quality, and the fact that you do this despite having an obscenely busy day job reminds me of Scott Alexander, cf. this AMA exchange from back when he was a full-time psychiatrist:
How do you write so quickly? I find it takes me a dozen or more hours to write anything as thorough as one of your blog posts. (It's possible that I'm just unusually slow).
Scott: I guess I don't really understand why it takes so many people so long to write. They seem to be able to talk instantaneously, and writing isn't that different from speech. Why can't they just say what they want to say, but instead of speaking it aloud, write it down?
(Yeah, that level of clear thinking to clear writing translation is an insane privilege.)
On the other hand, Brad's reply to you reminds me of my younger self. I was horrible at this, and worked my rear off for years just to reach the starting point of my more innately articulate peers, who sailed through the job interviews, scholarship interviews, etc. that I kept bombing out of. You can tell how much I care about this by the fact that I could link to a throwaway comment above from deep within the chat threads of an 8-year-old reddit AMA, where someone mentioned a thing they had that I didn't. I can definitely see younger me being in the majority of your poll.
I think this has to do with the fact that I think mostly nonverbally, which makes the thought-to-writing/speech translation much harder. I suspect vast swathes of the population are similar. (The wordcel vs. shape-rotator thing is related, although I dislike the discourse around it.) This makes us, relatively speaking, voiceless in public fora, so discourse gets dominated by verbal thinkers, which skews the intellectual environment and culture.
Going back and forth with AI, reviewing, and drafting can turn a writing process that might take several days to a week or more, into an hour or two, or less
I went "yeah definitely for nonverbal-ish thinkers, and I think this has the potential to reduce the skew and improve intellectual variety in discourse and culture, and separately I expect verbal-ish thinkers won't appreciate this benefit" and sure enough your reply confirmed the latter.
That said, I do mostly agree with you that I haven't been very impressed by the heavily AI-assisted writings I've seen, and like you I really dislike "AI voice", so to me this has been more potential than realised benefit so far. Some guesses:
I'm wrong about the above
AI isn't good enough yet to properly bridge the translation gap between heavily nonverbal-infused thinking and writing. (Or it is, but people aren't using the paid versions)
Nonverbal-ish thinkers just don't reason as clearly as they think they do, and they never noticed this because, unlike verbal-ish thinkers, they haven't often translated their thoughts into writing, which exposes thinking gaps; AI-assisted writeups of their half-baked thoughts then fill in those gaps with slop
The written word isn't the right translation target for nonverbal-ish thinking; it's something else, and (more advanced than today's) AI can potentially assist with this too. I'm thinking of Bret Victor's humane dynamic medium. Dangit, I should've just quoted these sections instead of subjecting you to my rambling:
A way in which people conceive and share thoughts. An idea might be expressed as a speech, a song, a drawing, a video, an essay, an equation, a tweet... These are different media.
Certain media open up new threads of thought that are otherwise inconceivable. Greek drama was made possible by writing; Shakespearean drama was made possible by print; Newtonian physics was made possible by equations.
The deepest effects are realized when a medium is diffused throughout a culture, not in the hands of a select few. A literate society is one in which all people participate in the exchange of written ideas, where the visual organization of words is second nature in the cultural consciousness. Societies with designated scribes do not enjoy the most significant benefits of literacy.
The conceiving and sharing of ideas represented computationally.
Computers can be used for efficiently distributing static media, as when reading an article or watching a video. But by “dynamic medium”, we mean the representation of ideas in which computation is essential, by enabling active exploration of implications and possibilities.
The modern world is shaped by vast complex systems — technical systems, environmental systems, societal systems — which cannot be clearly seen nor deeply understood via non-dynamic media. The dynamic medium may enable humanity to grasp and grapple with this century’s most critical ideas.
A dynamic medium which is communal, gives all people full agency, and is part of the real world. [more]
By “communal”, we mean bringing people together in the same physical space, with a medium that supports and strengthens face-to-face interaction, shared hands-on work, tacit knowledge, mutual context, and generally being present in the same reality.
By “agency”, we mean a person’s ability and confidence to view, change, extend, and remake every aspect of a system that they rely on, especially for fluently exploring new ideas and improvising solutions in unique situations. In the case of computing systems, this implies top-to-bottom programmability and composability, in a form that is accessible and human-scale.
By “real world”, we mean that material in the medium physically exists, and all of our human abilities and human senses can be applied to it. People are free to make use of their whole selves, every feature of their physical body and of the physical world, instead of interacting with a simulation through an interface.
“Real world” also refers to being situated in reality — understanding what’s actually happening and how things actually work instead of just abstractions; awareness of larger contexts — and especially the local reality of local needs and local knowledge rather than top-down centralized mass-produced solutions.
Andres pointed out a sad corollary downstream of people's misinterpretation of regression to the mean as indicating causality when there might be none. From Tversky & Kahneman (1982) via Andrew Gelman:
We normally reinforce others when their behavior is good and punish them when their behavior is bad. By regression alone, therefore, they are most likely to improve after being punished and most likely to deteriorate after being rewarded. Consequently, we are exposed to a lifetime schedule in which we are most often rewarded for punishing others, and punished for rewarding them.
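A toy simulation makes the purely statistical nature of this vivid (illustrative Python; all thresholds and parameters are arbitrary choices of mine): each performance is skill plus independent noise, so performances following "punishment" (an unusually bad outing) improve on average, and those following "reward" decline, with no causal feedback whatsoever.

```python
import random

random.seed(0)

# Toy model: performance = fixed skill + independent noise.
SKILL, NOISE, TRIALS = 0.0, 1.0, 100_000

after_bad, after_good = [], []
prev = SKILL + random.gauss(0, NOISE)
for _ in range(TRIALS):
    cur = SKILL + random.gauss(0, NOISE)
    if prev < -1.0:                    # "punished" after a bad performance
        after_bad.append(cur - prev)   # change following punishment
    elif prev > 1.0:                   # "rewarded" after a good performance
        after_good.append(cur - prev)  # change following reward
    prev = cur

# Regression alone produces apparent improvement after punishment
# (positive mean change) and apparent decline after reward (negative).
print(sum(after_bad) / len(after_bad))
print(sum(after_good) / len(after_good))
```

The mean change after "punishment" comes out positive and after "reward" negative, exactly the illusory reinforcement schedule Tversky and Kahneman describe.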
This essay's examples and choice of emphasis made me uneasy, despite my wholehearted agreement with the title as stated and with most of the object-level advice ("be hungry, shake complacency, don't get caught up in short-term work incentives" etc). Some scattered reactions:
It feels a bit rude to link to an 80K essay given you work there, but I think of this piece as (maybe unintentionally) encouraging single-player thinking, by valorizing individual heroism in its choice of examples over the multiplayer mindset that doing good better together requires. Individual intensity doesn't seem to be the binding constraint on solving the world's biggest problems so much as trust, coordination, and institutional quality are (alongside good judgment, more below). It's unfortunate that we have plenty of memetically galvanising anecdotes for the former over the latter; maybe "create more content to make multiplayer altruism sexy" should be a cause X, cf. your remark that there are too few stories of the Fred Hollows and Viktor Zhdanovs and that they're much less famous than the Jensens and LBJs. It's also unfortunate that the traits mentioned in the anecdotes (Jensen being an asshole, LBJ being a lying manipulator) are memetically more fit than integrity, good character, etc., as they're corrosive to the trust foundational to multiplayer altruism.
If you buy that effectiveness = judgment x ambition x risk appetite, and that the essay's motivating example is a central one, then good judgment arguably beats ambition – even more so on the margin, given how undersupplied it is relative to ambition in EA, and doubly so for longtermist work (cf. Holden singling it out as an aptitude, OP struggling with sign uncertainty back then, etc). You do mention this, cf. misplaced ambition, but I think it's a lot harder than "don't do a Jiro" suggests and should be more central to the thesis.
Messaging-wise, I worry that impressionable younger folks, like me a few years ago, might take away a simplistic maximising vibe from your examples despite all the nuance, which is perilous in a way that's hard to deeply appreciate until they've developed the good judgment to see why. I think this is especially the case with talented, driven folks.
Ultimately I don't think we disagree on much. Just a bummer that "cooperation-first character-shaped judgment-steered ambition" has no chance of catching on vs "be more ambitious"...
I was going to link to the 2011 GiveWell blog post by Holden Karnofsky arguing against taking EV estimates literally, but I see Alex Berger has already mentioned it above. I'd call out these passages in particular to save folks the effort of clicking through:
While some people feel that GiveWell puts too much emphasis on the measurable and quantifiable, there are others who go further than we do in quantification, and justify their giving (or other) decisions based on fully explicit expected-value formulas. The latter group tends to critique us – or at least disagree with us – based on our preference for strong evidence over high apparent “expected value,” and based on the heavy role of non-formalized intuition in our decisionmaking. This post is directed at the latter group.
We believe that people in this group are often making a fundamental mistake, one that we have long had intuitive objections to but have recently developed a more formal (though still fairly rough) critique of. The mistake (we believe) is estimating the “expected value” of a donation (or other action) based solely on a fully explicit, quantified formula, many of whose inputs are guesses or very rough estimates. We believe that any estimate along these lines needs to be adjusted using a “Bayesian prior”; that this adjustment can rarely be made (reasonably) using an explicit, formal calculation; and that most attempts to do the latter, even when they seem to be making very conservative downward adjustments to the expected value of an opportunity, are not making nearly large enough downward adjustments to be consistent with the proper Bayesian approach.
This view of ours illustrates why – while we seek to ground our recommendations in relevant facts, calculations and quantifications to the extent possible – every recommendation we make incorporates many different forms of evidence and involves a strong dose of intuition. And we generally prefer to give where we have strong evidence that donations can do a lot of good rather than where we have weak evidence that donations can do far more good – a preference that I believe is inconsistent with the approach of giving based on explicit expected-value formulas (at least those that (a) have significant room for error (b) do not incorporate Bayesian adjustments, which are very rare in these analyses and very difficult to do both formally and reasonably).
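A minimal sketch of the kind of Bayesian adjustment Holden describes, under a simplifying normal-normal assumption (his post works through a similar setup; every number below is made up for illustration): the noisier the explicit EV estimate, the harder it gets shrunk toward the prior.

```python
def bayesian_adjust(est_mean, est_sd, prior_mean, prior_sd):
    """Normal-normal conjugate update: shrink a noisy EV estimate
    toward the prior in proportion to its unreliability."""
    w = prior_sd**2 / (prior_sd**2 + est_sd**2)  # weight on the estimate
    post_mean = w * est_mean + (1 - w) * prior_mean
    post_sd = (1 / (1 / prior_sd**2 + 1 / est_sd**2)) ** 0.5
    return post_mean, post_sd

# Prior: interventions in this space are ~10x baseline, give or take 10x.
# A back-of-envelope claim of 1000x with huge uncertainty barely moves
# the posterior off the prior...
print(bayesian_adjust(1000, 500, 10, 10))
# ...while a modest 30x claim backed by strong evidence moves it substantially.
print(bayesian_adjust(30, 5, 10, 10))
```

The point of the toy model is Holden's: an "exciting" estimate with enormous error bars should end up adjusted nearly all the way back down, and intuitively conservative-seeming discounts usually aren't conservative enough.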
Sequence thinking involves making a decision based on a single model of the world: breaking down the decision into a set of key questions, taking one’s best guess on each question, and accepting the conclusion that is implied by the set of best guesses (an excellent example of this sort of thinking is Robin Hanson’s discussion of cryonics). It has the form: “A, and B, and C … and N; therefore X.” Sequence thinking has the advantage of making one’s assumptions and beliefs highly transparent, and as such it is often associated with finding ways to make counterintuitive comparisons.
Cluster thinking – generally the more common kind of thinking – involves approaching a decision from multiple perspectives (which might also be called “mental models”), observing which decision would be implied by each perspective, and weighing the perspectives in order to arrive at a final decision. Cluster thinking has the form: “Perspective 1 implies X; perspective 2 implies not-X; perspective 3 implies X; … therefore, weighing these different perspectives and taking into account how much uncertainty I have about each, X.” Each perspective might represent a relatively crude or limited pattern-match (e.g., “This plan seems similar to other plans that have had bad results”), or a highly complex model; the different perspectives are combined by weighing their conclusions against each other, rather than by constructing a single unified model that tries to account for all available information.
A key difference with “sequence thinking” is the handling of certainty/robustness (by which I mean the opposite of Knightian uncertainty) associated with each perspective. Perspectives associated with high uncertainty are in some sense “sandboxed” in cluster thinking: they are stopped from carrying strong weight in the final decision, even when such perspectives involve extreme claims (e.g., a low-certainty argument that “animal welfare is 100,000x as promising a cause as global poverty” receives no more weight than if it were an argument that “animal welfare is 10x as promising a cause as global poverty”).
Finally, cluster thinking is often (though not necessarily) associated with what I call “regression to normality”: the stranger and more unusual the action-relevant implications of a perspective, the higher the bar for taking it seriously (“extraordinary claims require extraordinary evidence”).
... I don’t believe that either style of thinking fully matches my best model of the “theoretically ideal” way to combine beliefs (more below); each can be seen as a more intellectually tractable approximation to this ideal.
I believe that each style of thinking has advantages relative to the other. I see sequence thinking as being highly useful for idea generation, brainstorming, reflection, and discussion, due to the way in which it makes assumptions explicit, allows extreme factors to carry extreme weight and generate surprising conclusions, and resists “regression to normality.” However, I see cluster thinking as superior in its tendency to reach good conclusions about which action (from a given set of options) should be taken.
... Sequence thinking presumes a particular framework for thinking about the consequences of one’s actions. It may incorporate many considerations, but all are translated into a single language, a single mental model, and in some sense a single “formula.” I believe this is at odds with how successful prediction systems operate, whether in finance, software, or domains such as political forecasting; such systems generally combine the predictions of multiple models in ways that purposefully avoid letting any one model (especially a low-certainty one) carry too much weight when it contradicts the others. On this point, I find Nate Silver’s discussion of his own system and the relationship to the work of Philip Tetlock (and the related concept of foxes vs. hedgehogs) germane
While the post is over a decade old, it still seems foundational to how GiveWell thinks about its CEAs:
Cost-effectiveness is the single most important input in our evaluation of a program's impact. However, there are many limitations to cost-effectiveness estimates, and we do not assess programs solely based on their estimated cost-effectiveness.
I think of cluster-thinking-based intervention ranking as better than the sequence-thinking-plus-Bayesian-correction approach you explored above for accounting for the optimiser's curse, for these reasons – especially the observation that successful prediction systems across most domains use cluster, not sequence, thinking.
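To make the "sandboxing" idea from the quoted passage concrete, here's a hypothetical toy aggregator (every name, number, and design choice below is mine, not Holden's): each perspective supplies an effect-size claim on a log scale plus a certainty, and low-certainty perspectives get their claims capped before weighting, so a shaky 100,000x argument carries no more weight than a shaky 10x one.

```python
def cluster_aggregate(perspectives, cap=1.0):
    """Toy 'cluster thinking' combiner. Each perspective is a pair
    (log10_effect_claim, certainty in [0, 1]). Low-certainty
    perspectives have their claims capped ('sandboxed') so an
    extreme but shaky argument can't dominate the verdict."""
    total, weight = 0.0, 0.0
    for claim, certainty in perspectives:
        capped = max(-cap, min(cap, claim)) if certainty < 0.5 else claim
        total += certainty * capped
        weight += certainty
    return total / weight

# A shaky "100,000x" claim (log10 = 5) ends up treated identically to a
# shaky "10x" claim (log10 = 1), per the quoted animal-welfare example.
shaky_extreme = [(5.0, 0.2), (0.0, 0.9), (0.3, 0.8)]
shaky_modest = [(1.0, 0.2), (0.0, 0.9), (0.3, 0.8)]
print(cluster_aggregate(shaky_extreme) == cluster_aggregate(shaky_modest))
```

A sequence-thinking combiner would instead let the 100,000x claim dominate the expected value; the cap is a crude stand-in for the "no more weight than 10x" behaviour Holden describes.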
My go-to diagram for illustrating your point, from (who else?) Scott Alexander's varieties of argumentative experience:
(also related: Value Differences As Differently Crystallized Metaphysical Heuristics and the previous essays in that series)