AI Programme Officer at Longview Philanthropy and AI DPhil student at Oxford
Thanks for this post; I agree on many of the key points. I was Longview's grant investigator on CAIP and, as I wrote in our official reply to CAIP (posted here), I wish there had been enough 501c4 funding available to sustain CAIP. Unfortunately, funding for 501c4 work remains scarce.
If anyone reading this is interested in contributing >$100K to 501c4 policy advocacy or any other kind of work on AI safety, please feel free to reach out to me at aidan@longview.org. We've comprehensively reviewed the 501c4 policy advocacy ecosystem and many other opportunities, and we’d be happy to offer detailed info and donation recommendations to potential large donors.
Agreed with the other answers on the reasons why there's no GiveWell for AI safety. But in case it's helpful, I should say that Longview Philanthropy offers advice to donors looking to give >$100K per year to AI safety. Our methodology is a bit different from GiveWell's, but we do use cost-effectiveness estimates. We investigate funding opportunities across the AI landscape, from technical research to field-building to policy in the US, EU, and around the world, trying to find the most impactful opportunities for the marginal donor. We also do active grantmaking, such as our calls for proposals on hardware-enabled mechanisms and digital sentience. More details here. Feel free to reach out to aidan@longview.org or simran@longview.org if you'd like to learn more.
Now, Anthropic, OpenAI, Google DeepMind, and xAI say their most powerful models might have dangerous biology capabilities and thus could substantially boost extremists—but not states—in creating bioweapons.
I think the "not states" part of this is incorrect in the case of OpenAI, whose Deep Research system card said: "Our evaluations found that deep research can help experts with the operational planning of reproducing a known biological threat, which meets our medium risk threshold."
One other potential suggestion: Organizers should consider focusing on their own career development rather than field-building if their timelines are shortening and they think they can have an impact sooner through direct work than through field-building. Personally, I regret much of the time I spent starting an AI safety club in college because it traded off against building skills and experience in direct work. I think my impact through direct work has been significantly greater than my impact through field-building, and I should've spent more time on direct work in college.
What about corporations or nation states during times of conflict? Do you think it's accurate to model them as roughly as ruthless in pursuit of their own goals as future AI agents?
They don't have the same psychological makeup as individual people, they have a strong tradition and culture of maximizing self-interest, and they face strong incentives and selection pressures to maximize fitness (i.e., for companies to profit, for nation states to ensure their own survival) lest they be outcompeted by more ruthless competitors. While I'd expect these entities to show some care for goals besides self-interest, on average I think the most reliable predictor of their behavior is the maximization of their self-interest.
If they're roughly as ruthless as future AI agents, and we've developed institutions that somewhat robustly align their ambitions with pro-social action, then we should have some optimism that we can find similarly productive systems for working with misaligned AIs.
Human history provides many examples of agents with different values choosing to cooperate thanks to systems and institutions:
If two agents' utility functions are perfect inverses, then I agree that cooperation is impossible. But when agents share a preference for some outcomes over others, even if they disagree about the preference ordering of most outcomes, cooperation is possible. In such general-sum games, well-designed institutions can systematically promote cooperative behavior over conflict.
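As a toy illustration of that point (payoff numbers invented for the example), here's a quick sketch contrasting a strictly zero-sum game, where no outcome makes both agents better off than the conflict baseline, with a general-sum game, where one does:

```python
# Toy illustration (payoff numbers invented): in a strictly zero-sum game no outcome
# leaves both agents better off than the conflict baseline, so there is nothing for an
# institution to steer toward; in a general-sum game such outcomes can exist.

ZERO_SUM = {  # every payoff pair sums to zero: one agent's gain is the other's loss
    ("cooperate", "cooperate"): (0, 0),
    ("cooperate", "defect"):    (-3, 3),
    ("defect",    "cooperate"): (3, -3),
    ("defect",    "defect"):    (0, 0),
}

GENERAL_SUM = {  # prisoner's-dilemma-style payoffs: interests partially overlap
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

def pareto_improvements(game, baseline=("defect", "defect")):
    """Outcomes at least as good for both agents as the conflict baseline,
    and strictly better for at least one of them."""
    base_a, base_b = game[baseline]
    return [
        outcome for outcome, (a, b) in game.items()
        if a >= base_a and b >= base_b and (a, b) != (base_a, base_b)
    ]

print(pareto_improvements(ZERO_SUM))     # [] -- no room for mutually beneficial deals
print(pareto_improvements(GENERAL_SUM))  # [('cooperate', 'cooperate')]
```

In the zero-sum case there is nothing for an institution to steer toward; in the general-sum case there is an outcome both agents prefer to mutual conflict, which is exactly where institutions can do useful work.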
Nice! This is a different question, but I'd be curious if you have any thoughts on how to evaluate risks from biological design tools (BDTs). There's a new NIST RFI on bio/chem models asking about this, and while I've seen some answers to the question, most of them express a ton of uncertainty and offer no great solutions. Maybe reliable evaluations aren't possible today, but what would we need to build them?
The topline comparison between LLMs and superforecasters seems a bit unfair. You compare a single LLM's forecast against the median from a crowd of superforecasters. But we know the median from a crowd is typically more accurate than any particular member of the crowd. Therefore I think it'd be more fair to compare a single LLM to a single superforecaster, or a crowd of LLMs against a crowd of superforecasters. Do we know whether the best LLM is better than the best individual forecaster in your sample, or how the median LLM compares to the median forecaster?
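To make the wisdom-of-crowds point concrete, here's a toy simulation (all parameters invented) comparing the Brier score of the crowd median against the average individual forecaster:

```python
# Toy simulation (invented parameters): the median forecast of a noisy crowd
# typically has a lower (better) Brier score than the average individual forecaster,
# which is why a crowd median is a tough baseline for any single forecaster or model.
import random
from statistics import median, mean

random.seed(0)
N_QUESTIONS, N_FORECASTERS = 200, 30

# Each question has a true probability; the binary outcome is sampled from it.
true_probs = [random.random() for _ in range(N_QUESTIONS)]
outcomes = [1 if random.random() < p else 0 for p in true_probs]

def brier(forecasts, outcomes):
    return mean((f - o) ** 2 for f, o in zip(forecasts, outcomes))

# Each forecaster reports the true probability plus independent noise, clipped to [0, 1].
def noisy_forecast(p, sigma=0.15):
    return min(1.0, max(0.0, random.gauss(p, sigma)))

crowd = [[noisy_forecast(p) for p in true_probs] for _ in range(N_FORECASTERS)]

# Aggregate: per-question median across the crowd.
median_forecasts = [median(f[q] for f in crowd) for q in range(N_QUESTIONS)]

print("crowd-median Brier:", round(brier(median_forecasts, outcomes), 4))
print("mean individual Brier:", round(mean(brier(f, outcomes) for f in crowd), 4))
# The crowd median scores better because independent errors partially cancel out.
```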
(Nitpick aside, this is very interesting research, thanks for doing it.)