I'm a researcher at Forethought; before that, I ran the non-engineering side of the EA Forum (this platform), ran the EA Newsletter, and worked on some other content-related tasks at CEA. [More about the Forum/CEA Online job.]
Selected posts
Background
I finished my undergraduate studies with a double major in mathematics and comparative literature in 2021. I was a research fellow at Rethink Priorities in the summer of 2021 and was then hired by the Events Team at CEA. I later switched to the Online Team. In the past, I've also done some (math) research and worked at Canada/USA Mathcamp.
> the main question is how high a priority this is, and I am somewhat skeptical it is on the ITN pareto frontier. E.g. I would assume plenty of people care about government efficiency and state capacity generally, and a lot of these interventions are generally about making USG more capable rather than too targeted towards longtermist priorities.
Agree that "how high-priority should this be" is a key question, and I'm definitely not sure it's on the ITN pareto frontier! (Nice phrase, btw.)
Quick notes on some things that raise the importance for me, though:
And I'm pretty worried that a decent amount of work aimed at mitigating the risks of AI could end up net-negative (for its own goals) by not tracking this issue and thus not focusing enough on the interventions that are actually worth pursuing --- further harming government AI adoption & competence / capacity in the process (e.g. I think some of the OMB/EO guidance from last year looked positive to me before I dug into this, and now looks negative). So I'd like to nudge some people who work on issues related to existential risk (and government) away from a view like: "all AI is scary/bad, anything that is 'pro-AI' increases existential risk, if this bundle of policies/barriers inhibits a bunch of different AI things then that's probably great even if I think only a tiny fraction is truly (existentially) risky", etc.
--
> this felt like neither the sort of piece targeted to mainstream US policy folks, nor that convincing for why this should be an EA/longtermist focus area.
Totally reasonable reaction IMO. To a large extent I see this as a straightforward flaw of the piece & how I approached it (partly due to lack of time - see my reply to Michael above), although I'll flag that my main hope was to surface this to people who are in fact kind of in between -- e.g. folks at think tanks that do research on existential security and have government experience/expertise.
--
> I'm unconvinced that e.g. OP should spin up a grantmaker focused on this (not that you were necessarily recommending this).
I am in fact not recommending this! (There could be specific interventions in the area that I'd see as worth funding, though, and it's also related to other clusters where something like the above is reasonable IMO.)
--
> Also, a few reasons govts may have a better time adopting AI come to mind:
>
> - Access to large amounts of internal private data
> - Large institutions can better afford one-time upfront costs to train or finetune specialised models, compared to small businesses
>
> But I agree the opposing reasons you give are probably stronger.
The data has to be accessible, though, and this is a pretty big problem. See e.g. footnote 17.
I agree that a major advantage could be that the federal government can in fact move a lot of money when ~it wants to, and could make some (cross-agency/...) investments into secure models or similar, although my sense is that right now that kind of thing is the exception/aspiration, not the rule/standard practice. (Another advantage is that companies do want to maintain good relationships with the government/admin, and might thus invest more in being useful. Also there are probably a lot of skilled people who are willing to help with this kind of work, for less personal gain.)
--
> If only this were how USG juggled its priorities!
🙃 (some decision-makers do, though!)
> I imagine there might be some very clever strategies to get a lot of the benefits of AI without many of the normal costs of integration.
>
> For example:
>
> - The federal government makes heavy use of private contractors. These contractors are faster to adopt innovations like AI.
> - There are clearly some subsets of the government that matter far more than others. And there are some that are much easier to improve than others.
> - If AI strategy/intelligence is cheap enough, most of the critical work can be paid for by donors. For example, we have a situation where there's a think tank that uses AI to figure out the best strategies/plans for much of the government, and government officials can choose to pay attention to this.
I'd be excited to see more work in this direction!
Quick notes: I think (1) is maybe the default way I expect things to go fine (although I have some worries about worlds where almost all US federal govt AI capacity is via private contractors). (2) seems right, and I'd want someone who has (or can develop) a deeper understanding of this area than me to explore this. Stuff like (3) seems quite useful, although I'm worried about things like ensuring access to the right kind of data and decision-makers (but partnerships / a mix of (2) and (3) could help).
(A lot of this probably falls loosely under "build capacity outside the US federal government" in my framework, but I think the lines are very blurry / a lot of the same interventions help with appropriate use/adoption of AI in the government and external capacity.)
> all very similar to previous thinking on how forecasting can be useful to the government
I hadn't thought about this — makes sense, and a useful flag, thank you! (I might dig into this a bit more.)
Thanks for this comment! I don’t view it as “overly critical.”
Quickly responding (just my POV, not Forethought’s!) to some of what you brought up ---
(This ended up very long, sorry! TLDR: I agree with some of what you wrote, disagree with some of the other stuff / think maybe we're talking past each other. No need to respond to everything here!)
A. Motivation behind writing the piece / target audience / vibe / etc.
Re:
> …it might help me if you explained more about the motivation [behind writing the article] [...] the article reads like you decided the conclusion and then wrote a series of justifications
I’m personally glad I posted this piece, but not very satisfied with it for a bunch of reasons, one of which is that I don’t think I ever really figured out what the scope/target audience should be (who I was writing for/what the piece was trying to do).
So I agree it might help to quickly write out the rough ~history of the piece:
In particular I don’t expect (and wasn’t expecting) that ~policymakers will read this, but hope it’s useful for people at relevant think tanks or similar who have more government experience/knowledge but might not be paying attention to one “side” of this issue or the other. (For instance, I think a decent fraction of people worried about existential risks from advanced AI don’t really think about how using AI might be important for navigating those risks, partly because all of AI kinda gets lumped together).
Quick responses to some other things in your comment that seem kinda related to what I'm responding to in this “motivation/vibe/…” cluster:
> I also found it odd that the report did not talk about extinction risk. In its list of potential catastrophic outcomes, the final item on the list was "Human disempowerment by advanced AI", which IMO is an overly euphemistic way of saying "AI will kill everyone".
We might have notably different worldviews here (to be clear mine is pretty fuzzy!). For one thing, in my view many of the scary “AI disempowerment” outcomes might not in fact look immediately like “AI kills everyone” (although to be clear that is in fact an outcome I’m very worried about), and unpacking what I mean by "disempowerment" in the piece (or trying to find the ideal way to say it) didn't seem productive -- IIRC I wrote something and moved on. I also want to be clear that rogue AI [disempowering] humans is not the only danger I’m worried about, i.e. it doesn’t dominate everything else for me -- the list you're quoting from wasn't an attempt to mask AI takeover, but rather a sketch of the kind of thing I'm thinking about. (Note: I do remember moving that item down the list at some point when I was working on a draft, but IIRC this was because I wanted to start with something narrower to communicate the main point, not because I wanted to de-emphasize ~AI takeover.)
> By my reading, this article is meant to be the sort of Very Serious Report That Serious People Take Seriously, which is why it avoids talking about x-risk.
I might be failing to notice my bias, but I basically disagree here --- although I do feel a different version of what you're maybe pointing to here (see next para). I was expecting that basically anyone who reads the piece will already have engaged at least a bit with "AI might kill all humans", and likely most of the relevant audience will have thought very deeply about this and in fact has this as a major concern. I also don't personally feel shy about saying that I think this might happen — although again I definitely don't want to imply that I think this is overwhelmingly likely to happen or the only thing that matters, because that's just not what I believe.
However I did occasionally feel like I was ~LARPing research writing when I was trying to articulate my thoughts, and suspect some of that never got resolved! (And I think I floundered a bit on where to go with the piece when getting conflicting feedback from different people - although ultimately the feedback was very useful.) In my view this mostly shows up in other ways, though. (Related - I really appreciated Joe Carlsmith's recent post on fake thinking and real thinking when trying to untangle myself here.)
B. Downside risks of the proposed changes
C. Is gov competence actually a bottleneck?
> I don't think government competence is what's holding us back from having good AI regulations, it's government willingness. I don't see how integrating AI into government workflow will improve AI safety regulations (which is ultimately the point, right?[^1]), and my guess is on balance it would make AI regulations less likely to happen because policy-makers will become more attached to their AI systems and won't want to restrict them.
My view is that you need both, we're not on track for competence, and we should be pretty uncertain about what happens on the willingness side.
D. Michael’s line item responses
1.
> invest in AI and technical talent
> What does that mean exactly? I can't think of how you could do that without shortening timelines so I don't know what you have in mind here.
I’m realizing this can be read as “invest in AI and in technical talent” — I meant “invest in AI talent and (broader) technical talent (in govt).” I’m not sure if that fully addresses the comment; my guess is that doing this might have a tiny shortening effect on timelines (though this is somewhat unclear, partly because in some cases, e.g., raising salaries for AI roles in govt might draw people away from frontier AI companies), but it's unlikely to be the decisive factor. (Maybe related: my view is that generally this kind of thing should be weighed instead of treated as a reason to entirely discard certain kinds of interventions.)
2.
> Streamline procurement processes for AI products and related tech
> I also don't understand this. Procurement by whom, for what purpose? And again, how does this not shorten timelines? (Broadly speaking, more widespread use of AI shortens timelines at least a little bit by increasing demand.)
I was specifically talking about agencies’ procurement of AI products — e.g. say the DOE wants a system that makes forecasting demand easier or whatever; making it easier for them to actually get such a system faster. I think the effect on timelines will likely be fairly small here (but am not sure), and currently think it would be outweighed by the benefits.
3.
> Gradual adoption is significantly safer than a rapid scale-up.
> This sounds plausible but I am not convinced that it's true, and the article presents no evidence, only speculation. I would like to see more rigorous arguments for and against this position instead of taking it for granted.
I’d be excited to see more analysis on this, but it’s one of the points I personally am more confident about (and I will probably not dive in right now).
4.
> And in a crisis — e.g. after a conspicuous failure, or a jump in the salience of AI adoption for the administration in power — agencies might cut corners and have less time for security measures, testing, in-house development, etc.
> This line seems confused. Why would a conspicuous failure make government agencies want to suddenly start using the AI system that just conspicuously failed? Seems like this line is more talking about regulating AI than adopting AI, whereas the rest of the article is talking about adopting AI.
Sorry, again my writing here was probably unclear; the scenarios I was picturing were more like:
Not sure if that answers the question/confusion?
5.
> Frontier AI development will probably concentrate, leaving the government with less bargaining power.
> I don't think that's how that works. Government gets to make laws. Frontier AI companies don't get to make laws. This is only true if you're talking about an AI company that controls an AI so powerful that it can overthrow the government, and if that's what you're talking about then I believe that would require thinking about things in a very different way than how this article presents them.
This section is trying to argue that AI adoption will be riskier later on, so the “bargaining power” I was talking about here is the bargaining power of the US federal govt (or of federal agencies) as a customer; the companies it’s buying from will have more leverage if they’re effectively monopolies. My understanding is that there are already situations where the US govt has limited negotiation power and maybe even makes policy concessions to specific companies specifically because of its relationship to those companies — e.g. in defense (Lockheed Martin, etc., although this is also kinda complicated) and again maybe Microsoft.
> And: would adopting AI (i.e. paying frontier companies so government employees can use their products) reduce the concentration of power? Wouldn't it do the opposite?
Again, the section was specifically trying to argue that later adoption is scarier than earlier adoption (in this case because there are (still) several frontier AI companies). But I do think that building up internal AI capacity, especially talent, would reduce the leverage any specific AI company has over the US federal government.
6.
> It’s natural to focus on the broad question of whether we should speed up or slow down government AI adoption. But this framing is both oversimplified and impractical
> Up to this point, the article was primarily talking about how we should speed up government AI adoption. But now it's saying that's not a good framing? So why did the article use that framing? I get the sense that you didn't intend to use that framing, but it comes across as if you're using it.
Yeah, I don't think I navigated this well! (And I think I was partly talking to myself here.) But maybe my “motivation” notes above give some context?
In terms of the specific “position” I in practice leaned into: Part of why I led with the benefits of AI adoption was the sense that the ~existential risk community (which is most of my audience) generally focuses on risks of AI adoption/use/products, and that's where my view diverges more. There's also been more discussion, from an existential risk POV, of the risks of adoption than there has been of the benefits, so I didn't feel that elaborating too much on the risks would be as useful.
7.
> Hire and retain technical talent, including by raising salaries
> I would like to see more justification for why this is a good idea. The obvious upside is that people who better understand AI can write more useful regulations. On the other hand, empirically, it seems that people with more technical expertise (like ML engineers) are on average less in favor of regulations and more in favor of accelerating AI development (shortening timelines, although they usually don't think "timelines" are a thing). So arguably we should have fewer such people in positions of government power.
The TLDR of my view here is something like "without more internal AI/technical talent (most of) the government will be slower on using AI to improve its work & stay relevant, which I think is bad, and also it will be increasingly reliant on external people/groups/capacity for technical expertise --- e.g. relying on external evals, or trusting external advice on what policy options make sense, etc. and this is bad."
8.
> Explore legal or other ways to avoid extreme concentration in the frontier AI market
> [...]
>
> The linked article attached to this quote says "It’s very unclear whether centralizing would be good or bad", but you're citing it as if it definitively finds centralization to be bad.
(The linked article is this one: https://www.forethought.org/research/should-there-be-just-one-western-agi-project )
I was linking to this to point to relevant discussion, not as a justification for a strong claim like “centralization is definitively bad” - sorry for being unclear!
9.
> If the US government never ramps up AI adoption, it may be unable to properly respond to existential challenges.
> What does AI adoption have to do with the ability to respond to existential challenges? It seems to me that once AI is powerful enough to pose an existential threat, then it doesn't really matter whether the US government is using AI internally.
I suspect we may have fairly different underlying worldviews here, but maybe a core underlying belief on my end is that there are things that it's helpful for the government to do before we get to ~ASI, and also there will be AI tools pre ~ASI that are very helpful for doing those things. (Or an alt framing: the world will get ~fast/complicated/weird due to AI before we reach the point where there's nothing the US gov could in theory do to make things go better.)
10.
> Map out scenarios in which AI safety regulation is ineffective and explore potential strategies
> I don't think any mapping is necessary. Right now AI safety regulation is ineffective in every scenario, because there are no AI safety regulations (by safety I mean notkilleveryoneism). Trivially, regulations that don't exist are ineffective. Which is one reason why IMO the emphasis of this article is somewhat missing the mark—right now the priority should be to get any sort of safety regulations at all.
I fairly strongly disagree here (with "the priority should be to get any sort of safety regulations at all") but don't have time to get into it, really sorry!
---
Finally, thanks a bunch for saying that you enjoyed some of my earlier writing & that I changed your thinking on slow vs quick mistakes! That kind of thing is always lovely to hear.
(Posted on my phone— sorry for typos and similar!)
Quick (visual) note on something that seems like a confusion in the current conversation:
Others have noted similar things (e.g. Will's earlier take on total vs human extinction). You might disagree with the model (curious if so!), but I'm a bit worried that one way or another people are talking past each other (at least from skimming the discussion).
(Commenting via phone, sorry for typos or similar!)
What actually changes about what you’d work on if you concluded that improving the future is more important on the current margin than trying to reduce the chance of (total) extinction (or vice versa)?
Curious for takes from anyone!
I wrote a Twitter thread that summarizes this piece and has a lot of extra images (I probably went overboard, tbh.)
I kinda wish I'd included the following image in the piece itself, so I figured I'd share it here:
Follow-up:
Quick list of some ways benchmarks might be (accidentally) misleading[1]
Additions are welcome! (Also, I couldn't quickly find a list like this earlier, but I'd be surprised if a better version than what I have above wasn't available somewhere; I'd love recommendations.)
Open Phil's announcement of their now-closed benchmarking RFP has some useful notes on this, particularly the section on "what makes for a strong benchmark." I also appreciated METR's list of desiderata here.
To be clear: I'm not trying to say anything on ways benchmarks might be useful/harmful here. And I'm really not an expert.
This paper looks relevant but I haven't read it.
TLDR: Notes on confusions about what we should do about digital minds, even if our assessments of their moral relevance are correct[1]
I often feel quite lost when I try to think about how we can “get digital minds right.” It feels like there’s a variety of major pitfalls involved, whether or not we’re right about the moral relevance of some digital minds.
Digital-minds-related pitfalls in different situations:

| Reality ➡️ Our perception ⬇️ | These digital minds are (non-trivially) morally relevant[2] | These digital minds are not morally relevant |
| --- | --- | --- |
| We see these digital minds as morally relevant | (1) We’re right. But we might still fail to understand how to respond, or collectively fail to implement that response. | (2) We’re wrong. We confuse ourselves, waste an enormous amount of resources on this[3] (potentially sacrificing the welfare of other beings that do matter morally), and potentially make it harder to respond to the needs of other digital minds in the future (see also). |
| We don’t see these digital minds as morally relevant | (3) We’re wrong. The default result here seems to be moral catastrophe through ignorance/sleepwalking (although maybe we’ll get lucky). | (4) We’re right. All is fine (at least on this front). |
Categories (2) and (3) above are ultimately about being confused about which digital minds matter morally — having an inappropriate level of concern, one way or another. A lot of the current research on digital minds seems to be aimed at avoiding this issue. (See e.g. and e.g..) I’m really glad to see this work; the pitfalls in these categories worry me a lot.
But even if we end up in category (1) — we realize correctly that certain digital minds are morally relevant — how can we actually figure out what we should do? Understanding how to respond probably involves answering questions like:
Answering these questions seems really difficult.
In many cases, extrapolating from what we can tell about humans seems inappropriate.[4] (Do digital minds find joy or meaning in some activities? Do they care about survival? What does it mean for a digital mind/AI system to be in pain? Is it ok to lie to AI systems? Is it ok to design AI systems that have no ~goals besides fulfilling requests from humans?)
And even concepts we have for thinking about what matters to humans (or animals) often seem ill-suited for helping us with digital minds.[5] (When does it make sense to talk about freedoms and rights,[6] or the sets of (relevant) capabilities a digital mind has, or even their preferences? What even is the right “individual” to consider? Self-reports seem somewhat promising, but when can we actually rely on them as signals about what’s important for digital minds?)
I’m also a bit worried that too much of the work is going towards the question of “which systems/digital minds are morally relevant” vs to the question of “what do we do if we think that a system is morally relevant (or if we’re unsure)?” (Note however that I've read a tiny fraction of this work, and haven't worked in the space myself!) E.g. this paper focuses on the former and closes by recommending that companies prepare to make thoughtful decisions about how to treat the AI systems identified as potentially morally significant by (a) hiring or appointing a DRI (directly responsible individual) for AI welfare and (b) developing certain kinds of frameworks for AI welfare oversight. These steps do seem quite good to me (at least naively), but — as the paper explicitly acknowledges — they’re definitely not sufficient.
Some of the work that’s apparently focused on the question of "which digital minds matter morally" involves working towards theories of ~consciousness, and maybe that will also help us with the latter question. But I’m not sure.
(So my quite uninformed independent impression is that it might be worth investing a bit more in trying to figure out what we should do if we do decide that some digital minds might be morally relevant, or maybe what we should do if we find that we’re making extremely little progress on figuring out whether they are.)
These seem like hard problems/questions, but I want to avoid defeatism.
I appreciated @rgb‘s closing remark in this post (bold mine):
> To be sure, our neuroscience tools are way less powerful than we would like, and we know far less about the brain than we would like. To be sure, our conceptual frameworks for thinking about sentience seem shaky and open to revision. Even so, trying to actually solve the problem by constructing computational theories which try to explain the full range of phenomena could pay significant dividends. My attitude towards the science of consciousness is similar to Derek Parfit’s attitude towards ethics: since we have only just begun the attempt, we can be optimistic.
Some of the other content I’ve read/skimmed feels like it’s pointing in useful directions on these fronts (and more recommendations are welcome!):
I’m continuing my quick take spree, with a big caveat that I’m not a philosopher and haven’t read nearly as much research/writing on digital minds as I want to.
And I’m not representing Forethought here! I don’t know what Forethought folks think about what I’m writing here.
The more appropriate term, I think, is “moral patienthood.” And we probably care about the degree of moral consideration that is merited in different cases.
This paper on “Sharing the world with digital minds” mentions this as one of the many failure modes — note that I’ve only skimmed it.
I’m often struck by how appropriate the War with the Newts is as an analogy/illustration/prediction for a bunch of what I’m seeing today, including on issues related to digital minds. But there’s one major way in which the titular newts differ significantly from potential morally relevant digital minds; the newts are quite similar to humans and can, to some extent, communicate their preferences in ways the book’s humans could understand if they were listening.
(One potential exception: there’s a tragic-but-brilliant moment in which “The League for the Protection of Salamanders” calls women to action sewing modest skirts and aprons for the Newts in order to appease the Newts’ supposed sense of propriety.)
This list of propositions touches on related questions/ideas.
See e.g. here:
> Whether instrumentally-justified or independently binding, the rights that some AI systems could be entitled to might be different from the rights that humans are entitled to. This could be because, instrumentally, a different set of rights for AI systems promotes welfare. For example, as noted by Shulman and Bostrom (2021), naively granting both “reproductive” rights and voting rights to AI systems would have foreseeably untenable results for existing democratic systems: if AI systems can copy themselves at will, and every copy gets a vote, then elections could be won via tactical copying. This set of rights would not promote welfare and uphold institutions in the same way that they do for humans. Or AI rights could differ because, independently of instrumental considerations, their different properties entitle them to different rights—analogously to how children and animals are plausibly entitled to different rights than adults.
I think this is a good question, and it's something I sort of wanted to look into and then didn't get to! (If you're interested, I might be able to connect you with someone/some folks who might know more, though.)
Quick general takes on what private companies might be able to do to make their tools more useful on this front (please note that I'm pretty out of my depth here, so take this with a decent amount of salt -- and also this isn't meant to be prioritized or exhaustive):
(Note also there's a pretty huge set of consultancies that focus on helping companies sell to the government, but the frame is quite different.)
And then in terms of ~market gaps, I'm again very unsure, but expect that (unsurprisingly) lower-budget agencies will be especially undersupplied — in particular the DOD has a lot more funding and capacity for this kind of thing — so building things for e.g. NIST could make sense. (Although it might be hard to figure out what would be particularly useful for agencies like NIST without actually being at NIST. I haven't really thought about this!)
I haven't looked into this at all, but given the prevalence of Microsoft systems (Azure etc.) in the US federal government (which afaik is greater than what we see in the UK), I wonder if Microsoft's relationship with OpenAI explains why we have ChatGPT Gov in the US, while Anthropic is collaborating with the UK government https://www.anthropic.com/news/mou-uk-government