Aim: to give some clarifications about ‘structural risk’ from AI that we have personally found helpful. Most of these draw directly from earlier work by Remco Zwetsloot and Allan Dafoe. We’re sharing them in case they’re also helpful to others.
Audience: people who want more surface area on the concept of ‘structural risk’. Could also be helpful to those interested in sources of AI risk in general.
Acknowledgements: this was written collaboratively with Jess Whittlestone. Many of the clarifications in this post come from this talk by Remco Zwetsloot. Thanks also to Ben Garfinkel for a helpful conversation and Allan Dafoe for feedback on a related piece.
When talking about risks from AI, people often discuss either ‘accident risks’ i.e. risks from AI systems behaving in unintended ways, or ‘misuse risks’, i.e. risks from AI systems being used for some malicious purpose. However, this categorisation misses a great deal: technology tends to have complex and indirect effects, and can cause harm even when no single actor deliberately misuses it and it behaves as intended (e.g. the effect of fossil fuels on climate change). The concept of ‘structural risk’ from AI has been introduced to cover such possibilities.
We believe this is an important point, and broadly agree with the core claims of existing work on structural risk.
However, we have noticed discussion where the concept has been used to refer to somewhat different ideas (e.g. in these forum posts). This doesn't really matter if you're just trying to illustrate the broad point that technology can cause harm even without malicious intent or incompetence, or that analysing the incentives of different actors can reveal important risk reduction interventions.
But if you want to make claims about (e.g.) the proportion of AI x-risk that is structural, or about how much to prioritise work on reducing structural AI risk, then it's important to be clear about which concept you're referring to.
In this post, we give some clarifications about structural risk from AI that we hope will improve the rigour of discussion about structural risk, when such rigour is useful.
Structural risk is - first and foremost - intended to be a perspective you can take, rather than a fixed list or category of risks
Taking a structural perspective (or "lens") on some risk means examining how that risk may be caused or influenced by structural factors, i.e. incentives which make actors (even competent and well-intentioned ones) more likely to take actions which result in harm.[1]
To give a simple analogy: if you’re trying to understand and prevent avalanches, a structural perspective would focus on factors such as the steepness of hiking trails, rather than on preventing particular actors from setting off an avalanche. This might be a more effective approach to mitigating risk, because many actors might set one off, and you would need to stop all of them to prevent catastrophe, which is probably very hard.[2]
Note that talking about taking a structural “perspective” on risk doesn’t mean there is a fixed list or category of risks that can be described as “structural”.[3]
It doesn’t necessarily mean that “structural risks” are disjoint from "accident" or "misuse" risks. In fact, it can be illuminating to take a structural perspective to understand both accident and misuse risks (as we'll see later). If “structure” is merely a useful perspective or lens for thinking about risk, it also doesn't make sense to talk about (e.g.) the proportion of AI x-risk which is "structural", or how much to prioritise work on reducing "structural risk" (because any risk could be analysed using the structural perspective). Instead, you could talk about how important structural causes are for a given AI x-risk, or how much efforts to shape the incentives of different actors would reduce some AI x-risk, compared to other interventions.
We found this distinction helpful, because we noticed we were getting confused about where the bounds lay around “structural risk” as a category, especially where risks classically considered accidents or misuse might have structural causes, such as AI developers having incentives to skimp on safety mechanisms. Thinking of structure as more of a “perspective” that can be illuminating when thinking about risk helped reduce this confusion.
That said, it does sometimes still seem useful to talk about specific types of risk which arise mostly from structural factors.
However, there are two interesting categories of AI risk which are illuminated by taking a structural perspective
Note that these categories also aren't disjoint from "misuse" and "accident" risks, nor are they intended to be - they are simply another useful way to carve up the space of risks from AI.[4][5]
AI risks with structural causes
We've already talked about structural causes - incentives which make actors (even competent and well-intentioned ones) more likely to take actions which have bad outcomes. Here are some possible AI risks with structural causes:[6]
- Dangerous tradeoffs between safety and performance
- E.g. the 2018 Uber self-driving car crash, where engineers disabled an emergency braking system that they worried would cause the car to behave overly cautiously and look worse than competitor vehicles. This decision to trade off safety for performance led to a crash and a pedestrian's death.
- Note that this could also be well-described as an "accident risk" (there was some incompetence on the part of the engineers, along with the structural causes).
- Reputational incentives and/or publication requirements leading to the diffusion of models (or techniques for training them) that can be misused.
- E.g. concerns about the diffusion of large language models which then get used to generate mass misinformation.
- Note that this could also be well-described as a "misuse risk" (there was some malintent on the part of those generating the misinformation, along with the structural causes).
- Some slow takeoff AI alignment failure stories in which competitive pressures play a key role in causing AI systems to gradually gain control over the future.
- E.g., What failure looks like part 1 or Another (outer) alignment failure story
- In these stories, competent actors without any malintent are incentivised to gradually deploy and hand over control to increasingly advanced systems, because that's the only way to remain economically and militarily competitive.
- Note that these risks can play out without any malintent or incompetence, so they are in fact disjoint from misuse and accident risks.
‘Non-AI’ risks partly caused by AI
Some risks that don't really seem to be "AI risks"—in the sense that the proximate cause of harm need not have anything to do with AI—have structural causes related to AI.[7] Some examples:
- Large language models make it cheaper/easier to create mass misinformation, incentivising bad actors to do so, which erodes epistemic security (e.g. it becomes much harder to trust information online), making coordinated responses to global crises more difficult.[8]
- AI enables and incentivises faster development in risky areas of science/technology (e.g. biotech and APM), and these technologies get into the hands of bad actors who do a lot of damage.
- AI improves data collection and processing techniques, allowing states to discover and sabotage each other's (previously secure) nuclear launch facilities. This undermines states' second strike capabilities, and therefore the foundations of nuclear strategic stability (based on mutually assured destruction), making nuclear war more likely.
- AI increases payoffs from building surveillance systems, leading to an erosion of privacy.
- AI increases returns to scale in production (e.g. because it makes coordination within companies easier), leading to more monopolistic markets.
Notice that the first two of these risks are caused by some amount of malintent (as well as structural causes), whereas the latter three need not involve any malintent or incompetence (so they are disjoint from misuse and accident risks).
We use ‘structural factors’ and ‘structural causes’ synonymously. ↩︎
Note that this is essentially the same idea as Andrew Critch's concept of a Robust Agent-Agnostic Process. ↩︎
Of course, you could choose to define the category of structural risk as risks where structural causes are especially important - but if so, this should be made clear, and note that the category would have vague boundaries. ↩︎
However - slightly confusingly - there are some specific risks within each category which are neither accidents nor malicious use. So, these two categories can be thought of as overlapping with, rather than being subsets of, "misuse" and "accident" risks. We’ll see this in the examples. ↩︎
This section draws very directly from Zwetsloot and Dafoe’s original piece on structural AI risk; we just add some extra examples that were clarifying to us. ↩︎
In the article that introduced the idea of structural risks from AI, this category of risk was called “Structure’s effect on AI”. ↩︎
In the article that introduced the idea of structural risks from AI, this category of risk was called “AI’s effect on structure”. ↩︎
For a relevant precedent to this kind of risk, it’s plausible that a lack of credible bipartisan information sources increased vaccine and mask hesitancy during the Covid-19 pandemic. ↩︎
Thanks for writing this. I continue to be deeply frustrated by the "accident vs. misuse" framing.
In fact, one reason I am writing this comment is that I think this post itself endorses that framing to too great an extent. For instance, I do not think it is appropriate to describe this simply as an accident:
I have a hard time imagining that they didn't realize this would likely make the cars less safe; I would say they made a decision to prioritize 'looking good' over safety, perhaps rationalizing it by saying it wouldn't make much difference and/or that they didn't have a choice because their livelihoods were at risk (which perhaps they were).
Now that I've got the whinging out of the way, I'll say thank you again for writing it: I found the distinction between "AI risks with structural causes" and "‘Non-AI’ risks partly caused by AI" quite valuable, and I hope it will be widely adopted.
Probably agree with you there
Also agree with that. I wasn't trying to claim it is simply an accident—there are also structural causes (i.e. bad incentives). As I wrote:
If I were writing this again, I wouldn't use the word "well-described" (unclear what I actually mean; sounds like I'm making a stronger claim than I was). Maybe I'd say "can partly be described as an accident".
But today, I think this mostly just introduces unnecessary/confusing abstraction. The main important point in my head now is: when stuff goes wrong, it can be due to malintent, incompetence, or the incentives. Often it's a complicated mixture of all three. Make sure your thinking about AI risk takes that into account.
And sure, you could carve up risks into categories, where you're like:
But it's pretty unclear what "mostly" means, and moreover it just feels kind of unnecessary/confusing.
I recently learned that in law, there is a breakdown along the following lines:
This seems like a good categorization.