Feedback welcome: www.admonymous.co/mo-putera
I work with CE/AIM-incubated charity ARMoR on research distillation, quantitative modelling, consulting, MEL, and general org-boosting to support policies that incentivise innovation and ensure access to antibiotics to help combat AMR. I was previously an AIM Research Program fellow, was supported by an FTX Future Fund regrant and later by Open Philanthropy's affected grantees program, and before that I spent 6 years doing data analytics, business intelligence, and knowledge + project management in various industries (airlines, e-commerce) and departments (commercial, marketing), after majoring in physics at UCLA and changing my mind about becoming a physicist. I've also initiated some local priorities research efforts, e.g. a charity evaluation initiative with the moonshot aim of reorienting my home country Malaysia's giving landscape towards effectiveness, albeit with mixed results.
I first learned about effective altruism circa 2014 via A Modest Proposal, Scott Alexander's polemic on using dead children as units of currency to force readers to grapple with the opportunity costs of subpar resource allocation under triage. I have never stopped thinking about it since, although my relationship to it has changed quite a bit; I related to Tyler's personal story (which unsurprisingly also references A Modest Proposal as a life-changing polemic):
I thought my own story might be more relatable for friends with a history of devotion – unusual people who’ve found themselves dedicating their lives to a particular moral vision, whether it was (or is) Buddhism, Christianity, social justice, or climate activism. When these visions gobble up all other meaning in the life of their devotees, well, that sucks. I go through my own history of devotion to effective altruism. It’s the story of [wanting to help] turning into [needing to help] turning into [living to help] turning into [wanting to die] turning into [wanting to help again, because helping is part of a rich life].
Likely the standard definition, e.g. by WHO:
Intimate partner violence refers to behaviour within an intimate relationship that causes physical, sexual or psychological harm, including acts of physical aggression, sexual coercion, psychological abuse and controlling behaviours. This definition covers violence by both current and former spouses and partners.
It's a subset of gender-based violence. See cause area overview, TLYCS's help women and girls fund, CE charity NOVAH, etc.
Having followed a lot of AI benchmarks over the years, my main heuristic takeaway regarding expert-parity claims is "prepare to be disappointed once you dig in", alongside "but they were still useful in advancing understanding and progress"; cf. SemiAnalysis' Benchmarks are bad but we need to keep using them anyways section for an outside-of-EA perspective. I'm also less bullish on long-range, poor-feedback-loop superforecasting more generally, for reasons along the lines of superforecaster Eli Lifland's takes (esp. #2 and #4), Dan Luu's appendix notes and comparisons to the actually-accurate futurists his review found, nostalgebraist on Metaculus badness, etc., which collectively reduce my enthusiasm for automating this.
By empirical evidence I meant anything empirical at all, including things like emergent misalignment, what might come out of Jacob Steinhardt's interpretability program, what Ryan Greenblatt says here, whatever the right value-analogue of Anthropic's functional emotions paper is (below), and so on, not just observable behavior. Maybe I'm conflating things or overloading "empirical", in which case my apologies.

Regarding the sharp left turn, Byrnes' opinionated review is the best argument for worrying about it that I'm aware of, but he isn't talking about today's LLMs and their descendants, which rules out your last paragraph's pointer to current work. Roger Dearnaley's intuition pump behind his take that the sharp left turn might not be as hopeless as it seems resonates with me, but his description seems vibes-based, so I can't tell if he's misunderstanding the sharp left turn. I do think Dearnaley's personal "full-stack" attempt at assessing alignment progress is the sort of answer I'd want to your question re: what sort of work would be good evidence, although my impression is you disagree for high-level generator reasons that would be ~intractable to resolve within the margins of EA Forum comments...
What do you think of efforts like Saffron Huang et al 2025? It's from a year ago as of this week, so I'd guess Anthropic has developed this line of work further since and integrated it into other workstreams and such.
AI assistants can impart value judgments that shape people's decisions and worldviews, yet little is known empirically about what values these systems rely on in practice. To address this, we develop a bottom-up, privacy-preserving method to extract the values (normative considerations stated or demonstrated in model responses) that Claude 3 and 3.5 models exhibit in hundreds of thousands of real-world interactions. We empirically discover and taxonomize 3,307 AI values and study how they vary by context. We find that Claude expresses many practical and epistemic values, and typically supports prosocial human values while resisting values like "moral nihilism". While some values appear consistently across contexts (e.g. "transparency"), many are more specialized and context-dependent, reflecting the diversity of human interlocutors and their varied contexts. For example, "harm prevention" emerges when Claude resists users, "historical accuracy" when responding to queries about controversial events, "healthy boundaries" when asked for relationship advice, and "human agency" in technology ethics discussions. By providing the first large-scale empirical mapping of AI values in deployment, our work creates a foundation for more grounded evaluation and design of values in AI systems.
The way the benefits calculation cashes out on an individual beneficiary basis essentially requires that they (mostly under-5s) live out full lives and enjoy 40 years of increased income; it isn't a function of how long the nets last.
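To illustrate the structure of that point, here's a purely hypothetical toy sketch (made-up numbers and parameter names, not GiveWell's or anyone else's actual model): the per-beneficiary benefit is a fixed stream of increased income over a full working life, so the lifespan of the nets themselves never enters the formula.

```python
# Toy sketch only: illustrates a per-beneficiary income benefit that depends on a
# fixed 40-year working life, not on how long the nets last. All numbers and
# parameters are hypothetical, not taken from any real cost-effectiveness model.

def income_benefit_per_beneficiary(baseline_income: float,
                                   income_uplift_pct: float,
                                   working_years: int = 40,
                                   discount_rate: float = 0.04) -> float:
    """Present value of the income increase a surviving beneficiary enjoys."""
    annual_gain = baseline_income * income_uplift_pct
    return sum(annual_gain / (1 + discount_rate) ** t for t in range(working_years))

# Example with made-up inputs: $500/yr baseline income, 10% income uplift.
print(round(income_benefit_per_beneficiary(500, 0.10), 2))
```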
I'm not sure this addresses Henry's critiques? In general, every bullet listed under "I think EA has punched above its weight in many ways with respect to making AI go well" is a proxy somewhere in the middle of the ToC chain, while his comment is more end-of-ToC focused, as he's skeptical of the proxies actually being beneficial; and none of these bullets address the counterfactuality he brought up. For instance, you mentioned the founding of Redwood Research as an example of EA making AI go well despite Henry explicitly being skeptical of its impact so far:
AI Safety organisations like MIRI and Redwood Research have been operating for 25 and 5 years respectively. As an outsider I couldn't point to any particular breakthrough they've made in AI alignment. Redwood seems to do some kinda interesting work on measuring rogue behaviour and creating checks. I dunno. Seems like any organisation trying to make a reliable AI product would be heavily incentivised to do this stuff regardless.
To be clear, I'm not taking sides or anything; I'm just disheartened by what I perceive to be a lot of talking past each other between AIS advocates and skeptics on this forum, some of which seems easily preventable, as in this case.
Ah, sorry for misunderstanding. I don't know.