The AI Eval Singularity is Near
Appendix - quotes on eval saturation
A bit sad to find out that Open Philanthropy’s (now Coefficient Giving) GCR Cause Prioritization team is no more.
I heard it was removed/restructured mid-2025. Seems like most of the people were distributed to other parts of the org. I don't think there were public announcements of this, though it is quite possible I missed something.
I imagine there must have been a bunch of other major changes around Coefficient that aren't yet well understood externally. This caught me a bit off guard.
There don't seem to be many active online artifa...
I don't mean to sound too negative on this - I did just say "a bit sad" on that one specific point.
Do I think that Coefficient is doing worse or better overall? It seems like they've been making a bunch of changes, and I don't feel like I have a good handle on the details. They've also been expanding a fair bit. I'd naively assume that a huge amount of work is going on behind the scenes to hire and grow, and that this is putting the org in a better place on average.
I would expect this (the GCR prio team change) to be some evidence that specific ambitious approac...
The GiveWell FAQ (quoted below) suggests that GiveWell focuses exclusively on human-directed interventions primarily for reasons of specialization—i.e., avoiding duplication of work already done by Coefficient Giving and others—rather than due to a principled objection to recommending animal-focused charities. If GiveWell is willing to recommend these organizations when asked, why not reduce the friction a bit?
A major part of GiveWell’s appeal has been its role as an “index fund for charities...
TBH my sense is that GiveWell is just being polite.
A perhaps more realistic motivation is that admitting animal suffering into GiveWell's models would implicitly force them to specify moral weights for animals (versus humans), and there is no way to do that without inviting huge controversy and leaving at least some groups very upset. Much easier to say "sorry, not our wheelhouse" and effectively set animal weights to zero.
FWIW I agree with this decision (of GiveWell's).
It seems like a worthwhile project to ask/pressure Anthropic's founders to make their pledges legally binding.
Anthropic's founders have pledged to donate 80% of their wealth. Ozzie Gooen estimates that in a few years this could be worth >$40 billion.
As Ozzie writes, adherence to the Giving Pledge (the Gates one) is pretty low: only 36% of deceased original pledgers met the 50% commitment. It's hard to follow through on such commitments, even for (originally) highly morally motivated people.
I'm going to guess the total donated will be around 30% of this from the EA funders, and a low percentage from the rest. I think your conservative number is WAY too low based on previous pledge fulfillment rates. I get that it's just a Claude generation, though.
But that's still at least 2 billion dollars, so I've updated positively on the amount of money that might go to good causes. Thanks for this, @Ozzie Gooen; strong upvote.
Mental health support for those working on AI risks and policy?
During the numerous projects I work on relating to AI risks, policies, and future threats/scenarios, I speak to a lot of people who are being exposed to issues of a catastrophic and existential nature for the first time (or grappling with them in detail for the first time). This, combined with the likelihood that things will get worse before they get better, makes me frequently wonder: are we doing enough around mental health support?
Things that I don’t know exist but feel they should. Some may sound OTT ...
It is popular to hate on Swapcard, and yet Swapcard seems like the best available solution despite its flaws. Claude Code and other AI coding assistants are very good nowadays, and conceivably, someone could just Claude Code a better Swapcard that maintained feature parity without the flaws.
Overall I'm guessing this would be too hard right now, but we do live in an age of mysteries and wonders, and it gets easier every month. One reason for optimism: it seems like the Swapcard team is probably not focused on the somewhat odd use case of EAGs in general (...
What are people's favorite arguments/articles/essays trying to lay out the simplest possible case for AI risk/danger?
Every single argument for AI danger/risk/safety I’ve seen seems to overcomplicate things. Either it has too many extraneous details, or it appeals to overly complex analogies, or it spends much of its time responding to insider debates.
I might want to try my hand at writing the simplest possible argument that is still rigorous and clear, without being trapped by common pitfalls. To do that, I want to quickly survey the field so I can learn from the best existing work as well as avoid the mistakes they make.
I've been experimenting recently with a longtermist wiki, written fully with LLMs.
Some key decisions/properties:
1. Fully LLM-generated, heavily relying on Claude Code.
2. Somewhat opinionated. Tries to represent something of a median longtermist/EA longview, with a focus on the implications of AI. All pages are rated for "importance".
3. Claude estimates a lot of percentages and letter grades for things. If you see a percentage or grade and there's no citation, it might well be a guess by Claude.
4. An emphasis on numeric estimates, models, and diagrams...
The next PauseAI UK protest will be (AFAIK) the first coalition protest between different AI activist groups, the main other group being Pull the Plug, a new organisation focused primarily on current AI harms. It will almost certainly be the largest protest focused exclusively on AI to date.
In my experience, the vast majority of people in AI safety are in favor of big-tent coalition protests on AI in theory. But when faced with the reality of working with other groups who don't emphasize existential risk, they have misgivings. So I'm curious what people he...
Consider adopting the term o-risk.
William MacAskill has recently been writing a bunch about how, if you’re a longtermist, it’s not enough merely to avoid the catastrophic outcomes. Even if we get a decent long-term future, it may still fall far short of the best future we could have achieved. This outcome — of a merely okay future, when we could have had a great future — would still be quite tragic.
Which got me thinking: EAs already have terms like x-risk (for existential risks, or things which could cause human extinction) and s-risk (for suffering risks,...
You might want to check out this (only indirectly related but maybe useful).
Personally I don't mind o-risk and think it has some utility, but s-risk somewhat seems like it still works here. An o-risk is just a smaller-scale s-risk, no?
Thanks to everyone who voted for our next debate week topic! Final votes were locked in at 9am this morning.
We can’t announce a winner immediately, because the highest-karma topic (and perhaps some of the others) touches on issues related to our "Politics on the EA Forum" policy. Once we’ve clarified which topics we would be able to run, we’ll be able to announce a winner.
Once we have, I’ll work on honing the exact wording. I’ll write a post with a few options, so that you can have input into the exact version we end up discussing.
PS: ...
Nice one @Toby Tremlett🔹. If the forum dictators decide that the democratically selected topic of democratic backsliding is not allowed, I will genuinely be OK with that decision ;).
Consultancy Opportunities – Biological Threat Reduction 📢📢📢
The World Organisation for Animal Health (WOAH) is looking for two consultants to support the implementation of the Fortifying Institutional Resilience Against Biological Threats (FIRABioT) Project in Africa. Supported by Global Affairs Canada's Weapons Threat Reduction Program, this high-impact initiative aims to support WOAH Members in strengthening capacities to prevent, detect, prepare for, respond to, and recover from biological threats. The project also supports the implementation of th...
EA Animal Welfare Fund almost as big as Coefficient Giving FAW now?
This job ad says they raised >$10M in 2025 and are targeting $20M in 2026. CG's public Farmed Animal Welfare 2025 grants are ~$35M.
Is this right?
Cool to see the fund grow so much either way.
Agree that it’s really great to see the fund grow so much!
That said, I don’t think it’s right to say it’s almost as large as Coefficient Giving. At least not yet... :)
The 2025 total appears to exclude a number of grants (including one to Rethink Priorities) and only runs through August of that year. By comparison, Coefficient Giving’s farmed animal welfare funding in 2024 was around $70M, based on the figures published on their website.
A delightful thing happened a couple of weeks ago, and it's an example of why more people should comment on the forum.
My forum profile is pretty sparse: fewer than a dozen comments, most of them along the lines of 'I appreciate the work done here!'. Nevertheless, because I have linked some social media profiles and set my city in the directory, a student from a nearby university reached out to ask about career advice after finding me on the forum. I gave her a personalised briefing on the local policy space and explained the details of how to...
Lots of “entry-level” jobs require applicants to have significant prior experience. This seems like a catch-22: if entry-level positions require experience, how are you supposed to get the experience in the first place? Needless to say, this can be frustrating. But we don’t think this is (quite) as paradoxical as it sounds, for two main reasons.
1: Listed requirements usually aren't as rigid as they seem.
Employers usually expect that candidates won’t meet all of the “essential” criteria. These are often more of a wish list than an exhaustive list...
Anecdotally, it seems like many employers have become more selective about qualifications, particularly in tech where the market got really competitive in 2024 - junior engineers were suddenly competing with laid-off senior engineers and FAANG bros.
Also, per their FAQ, Capital One has a policy not to select candidates who don't meet the basic qualifications for a role. One Reddit thread says this is also true for government contractors. Obviously this may vary among employers - is there any empirical evidence on how often candidates get hired without meeti...
@Ryan Greenblatt and I are going to record another podcast together (see the previous one here). We'd love to hear topics that you'd like us to discuss. (The questions people proposed last time are here, for reference.) We're most likely to discuss issues related to AI, but a broad set of topics other than "preventing AI takeover" are on topic. E.g. last time we talked about the cost to the far future of humans making bad decisions about what to do with AI, and the risk of galactic scale wild animal suffering.
Much of the stuff on the 80,000 Hours website's problem profiles that catches your interest would be something I'd like to watch you do a podcast on; getting it from people whose work I'm less familiar with is costlier. Also, neurology, cogpsych/evopsych/epistasis (e.g. this 80k podcast with Randy Nesse, this 80k podcast with Athena Aktipis), and especially more quantitative modelling approaches to culture change/trends (e.g. the 80k podcast with Cass Sunstein, the 80k podcast with Tom Moynihan, the 80k podcasts with David Duvenaud and Karnofsky). A lot...
Not sure who needs to hear this, but Hank Green has published two very good videos about AI safety this week: an interview with Nate Soares and a SciShow explainer on AI safety and superintelligence.
Incidentally, he appears to have also come up with the ITN framework from first principles (h/t @Mjreard).
Hopefully this is auspicious for things to come?
I'm researching how safety frameworks of frontier labs (Anthropic RSP, OpenAI Preparedness Framework, DeepMind FSF) have changed between versions.
Before I finish the analysis, I'm collecting predictions to compare with the actual findings later: 5 quick questions.
Disclaimer: please take this with a grain of salt. The questions were drafted quickly with AI help, and I'm treating this as a casual experiment, not rigorous research.
Thanks if you have a moment!
@Toby Tremlett🔹 and I will be repping the EA Forum Team at EAG SF in mid-Feb — stop by our office hours to ask questions, give us your hottest Forum takes, or just say hi and come get a surprise sweet! :)
I’ve seen a few people in the LessWrong community congratulate the community on predicting or preparing for covid-19 earlier than others, but I haven’t actually seen the evidence that the LessWrong community was particularly early on covid or gave particularly wise advice on what to do about it. I looked into this, and as far as I can tell, this self-congratulatory narrative is a complete myth.
Many people were worried about and preparing for covid in early 2020 before everything finally snowballed in the second week of March 2020. I remember it personally....
Following up a bit on this, @parconley. The second post in Zvi's covid-19 series is from 6pm Eastern on March 13, 2020. Let's remember where this is in the timeline. From my quick take above:
...On March 8, 2020, Italy put a quarter of its population under lockdown, then put the whole country on lockdown on March 10. On March 11, the World Health Organization declared covid-19 a global pandemic. (The same day, the NBA suspended the season and Tom Hanks publicly disclosed he had covid.) On March 12, Ohio closed its schools statewide. The U.S. declared a nationa