Announcing the SPT Model Web App for AI Governance

Paolo Bova; Robert Trager; Tanja Rüegg; Jonas Emanuel Müller; Modeling Cooperation

Robert Trager, Paolo Bova, Nicholas Emery-Xu, Eoghan Stafford, and Allan Dafoe, "Welfare Implications of Safety-Performance Tradeoffs in AI Safety Research", Working paper, August 2022.

Worrisome risk compensation can occur without competition in emerging technology contexts, but competitive dynamics often amplify the effects. Other key factors that make emerging technology contexts more susceptible to risk compensation are negative externalities from the implementation of new technologies and the absence of regulation.

In some scenarios, a safety improvement can even lead to higher total risk than if the safety improvement had not been discovered. This edge case requires that the safety improvement scales to all levels of performance and that the winner of the competition is determined using a particular functional form. Both assumptions are quite restrictive, so while this result is plausible in the model, it is highly unlikely to apply in practice.

Our model ignores the possibility that the safety improvement makes researchers more likely to develop new technologies which push out the performance boundary. If this is the case, some risk compensation will occur, though likely to a lesser degree than when companies operate far from the performance boundary.

For a related model of a safety-performance tradeoff that examines the role of information, see this recent post, which summarizes Nicholas Emery-Xu, Andrew Park, and Robert Trager, "Information Hazards in Races for Advanced Artificial Intelligence", Working paper, June 2022.

Eoghan Stafford, Robert Trager, and Allan Dafoe, "Safety Not Guaranteed: International Strategic Dynamics of Risky Technology Races", Working paper, July 2022.

Eoghan Stafford and Robert Trager, "The IAEA Solution: Knowledge Sharing to Prevent Dangerous Technology Races", June 2022.
