
I’ve been writing about tangible things we can do today to help the most important century go well. Previously, I wrote about helpful messages to spread; how to help via full-time work; and how major AI companies can help.

What about major governments1 - what can they be doing today to help?

I think governments could play crucial roles in the future. For example, see my discussion of standards and monitoring.

However, I’m honestly nervous about most possible ways that governments could get involved in AI development and regulation today.

  • I think we still know very little about what key future situations will look like, which is why my discussion of AI companies (previous piece) emphasizes doing things that have limited downsides and are useful in a wide variety of possible futures.
  • I think governments are “stickier” than companies - I think they have a much harder time getting rid of processes, rules, etc. that no longer make sense. So in many ways I’d rather see them keep their options open for the future by not committing to specific regulations, processes, projects, etc. now.
  • I worry that governments, at least as they stand today, are far too oriented toward the competition frame (“we have to develop powerful AI systems before other countries do”) and not receptive enough to the caution frame (“We should worry that AI systems could be dangerous to everyone at once, and consider cooperating internationally to reduce risk”). (This concern also applies to companies, but see footnote.2)

(Click to expand) The “competition” frame vs. the “caution” frame

In a previous piece, I talked about two contrasting frames for how to make the best of the most important century:

The “caution” frame. This frame emphasizes that a furious race to develop powerful AI could end up making everyone worse off. This could be via: (a) AI forming dangerous goals of its own and defeating humanity entirely; (b) humans racing to gain power and resources and “lock in” their values.

Ideally, everyone with the potential to build sufficiently powerful AI would be able to pour energy into building something safe (not misaligned), and into carefully planning out (and negotiating with others on) how to roll it out, without a rush or a race. With this in mind, perhaps we should be doing things like:

  • Working to improve trust and cooperation between major world powers. Perhaps via AI-centric versions of Pugwash (an international conference aimed at reducing the risk of military conflict), perhaps by pushing back against hawkish foreign relations moves.
  • Discouraging governments and investors from shoveling money into AI research, encouraging AI labs to thoroughly consider the implications of their research before publishing it or scaling it up, working toward standards and monitoring, etc. Slowing things down in this manner could buy more time to do research on avoiding misaligned AI, more time to build trust and cooperation mechanisms, and more time to generally gain strategic clarity.

The “competition” frame. This frame focuses less on how the transition to a radically different future happens, and more on who's making the key decisions as it happens.

  • If something like PASTA is developed primarily (or first) in country X, then the government of country X could be making a lot of crucial decisions about whether and how to regulate a potential explosion of new technologies.
  • In addition, the people and organizations leading the way on AI and other technology advancement at that time could be especially influential in such decisions.

This means it could matter enormously "who leads the way on transformative AI" - which country or countries, which people or organizations.

Some people feel that we can make confident statements today about which specific countries, and/or which people and organizations, we should hope lead the way on transformative AI. These people might advocate for actions like:

  • Increasing the odds that the first PASTA systems are built in countries that are e.g. less authoritarian, which could mean e.g. pushing for more investment and attention to AI development in these countries.
  • Supporting and trying to speed up AI labs run by people who are likely to make wise decisions (about things like how to engage with governments, what AI systems to publish and deploy vs. keep secret, etc.)

Tension between the two frames. People who take the "caution" frame and people who take the "competition" frame often favor very different, even contradictory actions. Actions that look important to people in one frame often look actively harmful to people in the other.

For example, people in the "competition" frame often favor moving forward as fast as possible on developing more powerful AI systems; for people in the "caution" frame, haste is one of the main things to avoid. People in the "competition" frame often favor adversarial foreign relations, while people in the "caution" frame often want foreign relations to be more cooperative.

That said, this dichotomy is a simplification. Many people - including myself - resonate with both frames. But I have a general fear that the “competition” frame is going to be overrated by default for a number of reasons, as I discuss here.

Because of these concerns, I don’t have a ton of tangible suggestions for governments as of now. But here are a few.

My first suggestion is to avoid premature actions, including ramping up research on how to make AI systems more capable.

My next suggestion is to build up the right sort of personnel and expertise for challenging future decisions.

  • Today, my impression is that there are relatively few people in government who are seriously considering the highest-stakes risks and thoughtfully balancing both “caution” and “competition” considerations (see directly above). I think it would be great if that changed.
  • Governments can invest in efforts to educate their personnel about these issues, and can try to hire key personnel who are already on the knowledgeable and thoughtful side about them (while also watching out for some of the pitfalls of spreading messages about AI).

Another suggestion is to generally avoid putting terrible people in power. Voters can help with this!

My top non-“meta” suggestion for a given government is to invest in intelligence on the state of AI capabilities in other countries. If other countries are getting close to deploying dangerous AI systems, this could be essential to know; if they aren’t, that could be essential to know as well, in order to avoid premature and paranoid racing to deploy powerful AI.

A few other things that seem worth doing and relatively low-downside:

  • Fund alignment research (ideally alignment research targeted at the most crucial challenges) via agencies like the National Science Foundation and DARPA. These agencies have huge budgets (the two of them combined spend over $10 billion per year), and have major impacts on research communities.
  • Keep options open for future monitoring and regulation (see this Slow Boring piece for an example).
  • Build relationships with leading AI researchers and organizations, so that future crises can be handled relatively smoothly.
  • Encourage and amplify investments in information security. My impression is that governments are often better than companies at highly advanced information security (preventing cyber-theft even by determined, well-resourced opponents). They could help with, and even enforce, strong security at key AI companies.

Footnotes


  1. I’m centrally thinking of the US, but other governments with lots of geopolitical sway and/or major AI projects in their jurisdiction could have similar impacts. 

  2. When discussing recommendations for companies, I imagine companies that are already dedicated to AI, and I imagine individuals at those companies who can have a large impact on the decisions they make.

    By contrast, when discussing recommendations for governments, a lot of what I’m thinking is: “Attempts to promote productive actions on AI will raise the profile of AI relative to other issues the government could be focused on; furthermore, it’s much harder for even a very influential individual to predict how their actions will affect what a government ultimately does, compared to a company.” 

Comments



I’d love to help do more research in this space. I wrote this on the topic last week:

https://connoraxiotes.substack.com/p/the-uk-government-needs-to-protect

On the competition vs. caution framing, I think people often assume government is a homogeneous entity, when in fact there are very different parts of government with very different remits, and some remits are naturally aligned with a caution approach while others align with a competition approach.

  • 'Build relationships with leading AI researchers and organizations, so that future crises can be handled relatively smoothly.'


Regarding the above quote, is it fair to say that AI labs have neglected this? 

I think there needs to be more work on cultivating this relationship - the direct one between legislators (and their closest advisors) and the AI labs. These are the political actors with the most direct levers to legislate, and the biggest potential impact on AI regulation.

Because when/if things kick off (AGI development becomes scary or even just AI development becomes scary), AI labs that have cultivated this relationship are probably better positioned to make safety-based asks of a Government they have at least been trying to connect with.

The AI lab best connected to a government might (arguably) be OpenAI - do we think they are the optimal AI lab for influencing AI safety policy? Or would we prefer a more diverse mix of AI labs, perhaps ones more interested in the safety side? If so, other AI labs need to start cultivating these relationships!

[Question about text formatting] How did Holden format the text to include a collapsible section? If you don't see it search for "Click to expand". I cannot figure it out. Unfortunately using the Markdown as mentioned in this example doesn't work. Does anybody know? 
