The AI Eval Singularity is Near
* AI capabilities seem to be doubling every 4-7 months
* Humanity's ability to measure capabilities is growing much more slowly
* This implies an "eval singularity": a point at which capabilities grow faster than our ability to measure them
* It seems like the singularity is ~here in cybersecurity, CBRN, and AI R&D (supporting quotes below)
* It's possible that this is temporary, but the people involved seem pretty worried
Appendix - quotes on eval saturation
Opus 4.6
* "For AI R&D capabilities, we found that Claude Opus 4.6 has saturated most of our
automated evaluations, meaning they no longer provide useful evidence for ruling out ASL-4 level autonomy. We report them for completeness, and we will likely discontinue them going forward. Our determination rests primarily on an internal survey of Anthropic staff, in which 0 of 16 participants believed the model could be made into a drop-in replacement for an entry-level researcher with scaffolding and tooling improvements within three months."
* "For ASL-4 evaluations [of CBRN], our automated benchmarks are now largely saturated and no longer provide meaningful signal for rule-out (though as stated above, this is not indicative of harm; it simply means we can no longer rule out certain capabilities that may be pre-requisities to a model having ASL-4 capabilities)."
* It also saturated ~100% of the cyber evaluations
Codex-5.3
* "We are treating this model as High [for cybersecurity], even though we cannot be certain that it actually has these capabilities, because it meets the requirements of each of our canary thresholds and we therefore cannot rule out the possibility that it is in fact Cyber High."
I'm researching how the safety frameworks of frontier labs (Anthropic RSP, OpenAI Preparedness Framework, DeepMind FSF) have changed between versions.
Before I finish the analysis, I'm collecting predictions to compare with the actual findings later: 5 quick questions.
Disclaimer: please take this with a grain of salt. The questions were drafted quickly with AI help, and I'm treating this as a casual experiment, not rigorous research.
Thanks if you have a moment
We would welcome guest posts for Windfall Trust's AI Economics Brief here! If you're interested in summarizing your latest research for a broader audience of decision-makers, or want to share a thoughtful take, this is an opportunity to do so!
[Adapted from this comment.]
Two pieces of evidence commonly cited for near-term AGI are AI 2027 and the METR time horizons graph. AI 2027 is open to multiple independent criticisms, one of which is its use of the METR time horizons graph to forecast near-term AGI, or AI capabilities more generally; that use is not supported by the data and methodology behind the graph.
Two strong criticisms that apply specifically to the AI 2027 forecast are:
* It depends crucially on the subjective intuitions or guesses of the authors. If you don't share the authors' intuitions, or don't trust that those intuitions are likely correct, then there is no particular reason to take AI 2027's conclusions seriously.
* Credible critics claim that the headline results of the AI 2027 timelines model are largely baked in by the authors' modelling decisions, irrespective of what data the model uses. To a large extent, then, AI 2027's conclusions are not actually determined by the data; as with the previous point, they are effectively a restatement of the pre-existing beliefs and assumptions the authors chose to embed in their timelines model.
AI 2027 is largely based on extrapolating the METR time horizons graph. The following criticisms of the METR time horizons graph therefore extend to AI 2027:
* Some of the serious problems and limitations of the METR time horizons graph are sometimes (but not always) clearly disclosed by METR employees. Note the wide difference between the caveated description of what the graph says and the interpretation of the graph as a strong indicator of rapid, exponential improvement in general AI capabilities.
* Gary Marcus, a cognitive scientist and AI researcher, …
Technical Alignment Research Accelerator (TARA) applications close today!
Last chance to apply to join the 14-week, remotely taught, in-person-run program (based on the ARENA curriculum) designed to accelerate APAC talent towards meaningful technical AI safety research.
TARA is built for you to learn around full-time work or study by attending meetings in your home city on Saturdays and doing independent study throughout the week. Finish the program with a project to add to your portfolio, key technical AI safety skills, and connections across APAC.
See this post for more information and apply through our website here.
Are there any signs of governments beginning to do serious planning for Universal Basic Income (UBI) or a negative income tax? It feels like there's a real lack of urgency and rigour in policy engagement within government circles. The concept has obviously had its high-level advocates, à la Altman, but it still feels incredibly distant as any form of reality.
Meanwhile the impact is being seen in job markets right now: in the UK, graduate job openings have plummeted in the last 12 months. People I know with elite academic backgrounds are having a hard enough time finding jobs, let alone the vast majority of people who went to average universities. This is happening today, before any consensus that AGI has arrived and before widely recognised mass displacement in mid-career job markets. The impact is happening now, but preparation for major policy intervention in current fiscal scenarios seems really far off. If governments do view the risk of major employment market disruption as a realistic possibility (which I believe in many cases they do), are they planning for interventions behind the scenes? Or do they view the problem as too big to address until it arrives, preferring rapid response over careful planning, in the way the COVID emergency fiscal interventions emerged?
Would be really interested to hear of any good examples of serious thinking and preparation on how some form of UBI could be planned for (logistically and fiscally) over a near-term 5-year horizon.
Quick link-post highlighting Toner quoting Postrel’s dynamist rules + her commentary. I really like the dynamist rules as a part of the vision of the AGI future we should aim for:
“Postrel does describe five characteristics of ‘dynamist rules’:
I see some overlap with existing ideas in AI policy:
* Transparency, everyone’s favorite consensus recommendation, fits well into a dynamist worldview. It helps with Postrel’s #1 (giving individuals access to better information that they can act on as they choose), #3 (facilitating commitments), and #4 (facilitating criticism and feedback). Ditto whistleblower protections.
* Supporting the development of a third-party audit ecosystem also fits—it helps create and enforce credible commitments, per #3, and could be considered a kind of nestable framework, per #5.
* The value of open models in driving decentralized use, testing, and research is obvious through a dynamist lens, and jibes with #1 and #4. (I do think there should be some precautionary friction before releasing frontier models openly, but that’s a narrow exception to the broader value of open source AI resources.)
Another good bet is differential technological development, aka defensive accelerationism—proactively building technologies that help manage challenges posed by other technologies—though I can’t easily map it onto Postrel’s five characteristics. I’d be glad to hear readers’ ideas for other productive directions to push in.”
Dwarkesh (of the famed podcast) recently posted a call for new guest scouts. Given how influential his podcast is likely to be in shaping discourse around transformative AI (among other important things), this seems worth flagging and applying for (at least for students or early-career researchers in bio, AI, history, econ, math, or physics who have a few extra hours a week).
The role is remote, pays ~$100/hour, and expects ~5–10 hours/week. He’s looking for people who are deeply plugged into a field (e.g. grad students, postdocs, or practitioners) with high taste. Beyond scouting guests, the role also involves helping assemble curricula so he can rapidly get up to speed before interviews.
More details are in the blog post; link to apply (due Jan 23 at 11:59pm PST).