Bio

Feedback welcome: www.admonymous.co/mo-putera 

I work with CE/AIM-incubated charity ARMoR on research distillation, quantitative modelling, consulting, MEL, and general org-boosting to support policies that incentivise innovation and ensure access to antibiotics to help combat AMR. I was previously an AIM Research Program fellow, was supported by an FTX Future Fund regrant and later Open Philanthropy's affected grantees program, and before that I spent 6 years doing data analytics, business intelligence and knowledge + project management in various industries (airlines, e-commerce) and departments (commercial, marketing), after majoring in physics at UCLA and changing my mind about becoming a physicist. I've also initiated some local priorities research efforts, e.g. a charity evaluation initiative with the moonshot aim of reorienting my home country Malaysia's giving landscape towards effectiveness, albeit with mixed results.

I first learned about effective altruism circa 2014 via A Modest Proposal, Scott Alexander's polemic on using dead children as units of currency to force readers to grapple with the opportunity costs of subpar resource allocation under triage. I have never stopped thinking about it since, although my relationship to it has changed quite a bit; I related to Tyler's personal story (which unsurprisingly also references A Modest Proposal as a life-changing polemic):

I thought my own story might be more relatable for friends with a history of devotion – unusual people who’ve found themselves dedicating their lives to a particular moral vision, whether it was (or is) Buddhism, Christianity, social justice, or climate activism. When these visions gobble up all other meaning in the life of their devotees, well, that sucks. I go through my own history of devotion to effective altruism. It’s the story of [wanting to help] turning into [needing to help] turning into [living to help] turning into [wanting to die] turning into [wanting to help again, because helping is part of a rich life].

Comments

What do you think of Manheim's simple explanation for what makes good technologies good?

David Manheim's If AI is normal technology, history is not reassuring is a good read (emphasis mine): 

There’s a truism that technology is good - even if it creates winners and losers, it improves the world. Toby Ord argues that the conclusions about the benefits of technology is sensitive to the end of humanity - but this jumps over the transitions by starting from the assumption[1] that “long-term progress in science, technology, and values have tended to make people’s lives longer, freer, and more prosperous.” That is, looking back historically, the net impact misses the immense immediate harms of large scale technological changes that can last for generations.

As I’ll explain, the largest technological revolutions in human history are arguably the agricultural revolution and the industrial revolution. In both cases, the vast majority of those immediately affected were harmed, not helped. Of course, the longer term impact was positive; those benefits are not in question[2] - not that those alive during the transition should have cared.

The two obvious examples

The invention of agriculture led to increased food availability and around ten thousand years of greatly worsened health and lifespans[3]. The wealthiest and most powerful people benefited immensely from the population explosion, and from the wars that larger populations enabled and required; the population suffered from both malnutrition, and that same increase in the scale of violence[4].

The invention of industry was more beneficial to the consumer - but not to those directly involved. In 1840, over a third of the British population worked in a factory. This was bad, in part directly due to factory worker deaths, but also due to pollution and disease. Mortality shot up over the middle of the 1800s - the famed “urban penalty”, especially among children, albeit partially offset by reduced deaths because of sanitation later in the century[5]

Three more examples via ChatGPT/Manheim (it provided five, including the two above, which I've omitted here; again, emphasis mine):

  1. Writing and external symbolic storage - Administration, law, history, mathematics, scripture, bureaucracy, long-distance coordination. Early writing mostly helped palaces, temples, tax systems, accounting, property claims, labor control, and bureaucracy before it helped ordinary people read novels or do science. So the near-term “users” benefited, but many affected subjects may have faced more legible extraction and administration. Evidence from early Mesopotamia links writing with larger government buildings and multi-level bureaucracies.
  2. Metallurgy, especially iron - Tools, weapons, plows, empires, deforestation, intensified agriculture, military expansion. Bronze matters too, but iron’s scale and availability make it more transformative. Better tools helped agriculture and craft production, but weapons, fortifications, conquest, inequality, and elite control plausibly dominated early experience for many. The case is less clean because metal tools also had immediate productive benefits, but the war-and-hierarchy channel is very real.
  3. Electricity + computation + telecommunications - I’d bundle these reluctantly as the “information-electrical stack”: telegraph, telephone, radio, electric grids, computers, internet, AI. This led to surveillance, labor displacement, attention capture, military command/control, financial acceleration, and dependence on fragile networks[8].

Manheim's argument for what makes good technologies good:

There have been a couple of revolutionary changes in medicine and public health over the past couple centuries. The vaccine revolution, the advent of modern sanitation, and infection control each include a strong case that they were immediately beneficial, and stayed that way indefinitely[9]. Refrigeration, washing machines, and bicycles[10] are arguably more examples in this class. So some technologies really are just positive - but we need to ask which ones.

I think there’s a simple explanation; directly good things are good, but many other classes of transformative change end up disruptive in ways that hurt before they can help[11]. Technologies that have first order impacts on coordination and production, or that empower groups in other ways, tend to differentially benefit the powerful in ways that are harmful to others, either directly or indirectly[12].

I find myself instinctively resisting Manheim's explanation, since I'm generally keen on improvements to empowerment and coordination, but have to admit it parsimoniously explains the small-n historical track record above. The issue, as always, seems to be the gap between the beautiful ideal ("more empowerment! more coordination!") and the unavoidably-messy realities of implementation, people being people, etc.

Tangentially, this reminds me a bit of Holden Karnofsky's maximally-conservative utopia, which is just "status quo minus clearly-bad things". 

This isn't really a utopia in the traditional sense. It's trying to lay out one end of a spectrum.

Start here:

In this world, everything is exactly like the status quo, with one exception: cancer does not exist.

It may not be very exciting, but it's hard to argue with the claim that this would be better than the world as it is today.

This is basically the most conservative utopia I can come up with, because the only change it proposes is a change that I think we can all get on board with, without hesitation. Most proposed changes to the world would make at least some people uncomfortable (no inequality? No sadness?), but this one shouldn't. If we got rid of cancer, we'd still have death, we'd still have suffering, we'd still have struggle, etc. - we just wouldn't have cancer.

You can almost certainly improve this utopia further by taking more baby-steps along the same lines. Make a list of things that - like cancer - you think are just unambiguously bad, and would be happy to see no more of in the world. Then define utopia as "exactly like the status quo, except that all the things on my list don't exist." Examples could include:

  • Other diseases
  • Hunger
  • Non-consensual violence (not including e.g. martial arts, in which two people agree to a set of rules that allows specific forms of violence for a set period of time).
  • Racism, sexism, etc.

"Status quo, minus everything on my list" is a highly conservative utopia. Unlike literary utopias, it should be fairly clear that this world would be a major improvement on the world as it is.

I note that in my survey on fictional utopias, it was much easier to get widespread agreement (high average scores) for properties of utopia than for full utopian visions. For example, while no utopia description scored as high as 4 on a 5-point scale, the following properties all scored 4.5 or higher: "no one goes hungry", "there is no violent conflict," "there is no discrimination by race or gender."

Megaprojects for animals (or an updated version perhaps, this list being from 2022) seems more pertinent than ever. 

Your experience reminded me of how Holden Karnofsky described his career so far:

The general theme of my career is just taking questions, especially questions about how to give effectively, where it's just like no one's really gotten started on this question. Even doing a pretty crappy analysis can be better than what already exists. So often what I have done in my career, what I consider myself to have kind of specialized in, in a sense, is I do the first cut crappy analysis of some question that has not been analyzed much and is very important. Then I build a team to do better analysis of that question. That's been my general pattern. I think that's the most generalizable skill I've had

Do you reassess more frequently than annually or biannually, which is my impression of the Schelling point frequency for most folks?

Likely the standard definition, e.g. the WHO's:

Intimate partner violence refers to behaviour within an intimate relationship that causes physical, sexual or psychological harm, including acts of physical aggression, sexual coercion, psychological abuse and controlling behaviours. This definition covers violence by both current and former spouses and partners.

It's a subset of gender-based violence. See cause area overview, TLYCS's help women and girls fund, CE charity NOVAH, etc.

Having followed a lot of AI benchmarks over the years, my main heuristic takeaway regarding expert-parity claims is "prepare to be disappointed once you dig in", alongside "but they were still useful in advancing understanding and progress", cf. the "Benchmarks are bad but we need to keep using them anyways" section from SemiAnalysis for an outside-of-EA perspective. I'm also less bullish on superforecasting for long-range questions with poor feedback loops more generally, for reasons along the lines of superforecaster Eli Lifland's takes (esp. #2 and #4), Dan Luu's appendix notes and comparisons to the actually-accurate futurists his review found, nostalgebraist on metaculus badness, etc., which collectively reduce my enthusiasm for automating this.

By empirical evidence I meant anything empirical at all, including things like emergent misalignment and what might come out of Jacob Steinhardt's interpretability program and what Ryan Greenblatt says here and whatever the right value-analogue of Anthropic's functional emotions paper is (below) and so on, not just observable behavior. Maybe I'm conflating things or overloading "empirical", in which case my apologies.

image.png

Regarding the sharp left turn, Byrnes' opinionated review is the best argument for worrying about this that I'm aware of, but he isn't talking about today's LLMs and their descendants, which rules out your last paragraph's pointer to current work. Roger Dearnaley's intuition pump behind his take that the sharp left turn might not be as hopeless as it seems resonates with me, but his description seems vibes-based so I can't tell if he's misunderstanding the sharp left turn. I do think Dearnaley's personal "full-stack" attempt at assessing alignment progress is the sort of answer I'd want to your question re: what sort of work would be good evidence, although my impression is you disagree for high-level generator reasons that would be ~intractable to resolve within the margins of EA forum comments...
