At least for a moment in time, I was GWWC's 10,000th active pledger.
Of course. Many of these techniques are specific to certain parts of the risk assessment process. The document is unfortunately paywalled, but risk assessment can be said to have these three parts:

1. Risk identification
2. Risk analysis, which includes:
   a. consequence analysis
   b. likelihood analysis
   c. determining the level of risk
3. Risk evaluation
Risk treatment is missing here because it's sort of a separate process outside of risk assessment (though it is tied to risk evaluation), and ISO 31010 specifically addresses the risk assessment phase. "Brainstorming" is a class of techniques that can take different shapes and forms (in my previous work we used to have structured sessions to generate "what-if scenarios"), and these techniques mainly address 1 and some of 2a above.
But as Jan pointed out in his comment, perhaps safety cases are a meta-framework rather than a technique in themselves, so the quality of a safety case depends on the quality of the evidence put forth alongside the arguments, and that quality may in turn depend on the suitability and implementation of the specific techniques used to generate the evidence.
I'm curious how and why safety cases got so popular in AI safety. There are so many other risk assessment techniques out there; for reference, ISO 31010 lists 30 of them (see here), and they're far from exhaustive. My instinct is that it's because safety cases are purely text-based, easily understandable, and don't require proper risk management concepts (e.g. hazards, events, consequences) to work; so at some point perhaps someone suggested using them and the rest of the field just went with it.
(also, I don't know how common safety cases are in safety engineering; in my decade in the oil and gas industry I never heard of them, though they may be more common in other industries)
The defense in depth thesis is that you are best off investing some resources from your limited military budget in many different defenses (e.g. nuclear deterrence; intelligence gathering and early warning systems; an air force, navy and army; command and communication bunkers; diplomacy and allies) rather than specialising heavily in just one.
I'm not familiar with how this concept is used in the military, but in safety engineering I've never heard of it framed as a tradeoff between 'many layers, many holes' vs 'one layer, few holes'. The Swiss cheese model is usually meant to illustrate that your barriers are rarely 100% effective, so even if you think you have a great barrier, you should have more than one. From this perspective, having multiple barriers is straightforwardly good and doesn't imply justifying the use of weaker barriers.
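To make the "barriers are not 100% effective" point concrete, here's a minimal numeric sketch, assuming independent barriers; the barrier names and probabilities are entirely made up for illustration:

```python
# Minimal sketch of the Swiss cheese point: barriers are imperfect, so you stack
# several of them. Assumes independent barriers; all names and numbers below
# are hypothetical.

initiating_event_frequency = 1e-2  # initiating events per year (illustrative)

barrier_failure_probabilities = {
    "gas detection": 0.10,       # P(barrier fails to stop the event)
    "emergency shutdown": 0.05,
    "physical containment": 0.20,
}

residual_frequency = initiating_event_frequency
for barrier, p_fail in barrier_failure_probabilities.items():
    residual_frequency *= p_fail  # event only propagates if this barrier also fails

print(f"Residual frequency with all barriers: {residual_frequency:.1e} per year")
# ~1e-5 per year here, even though no individual barrier is anywhere near perfect.
```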
I would be interested to hear the counterpoints from those who have Disagree-voted on this post.
Likewise!
Do you think the "very particular worldview" you describe is found equally among those working on technical AI safety and AI governance/policy? My impression is that policy inherently requires thinking through concrete pathways of how AGI would lead to actual harm as well as greater engagement with people outside of AI safety.
I think it's quite prevalent regardless. While some people's roles indeed require them to analyze concrete pathways more than others, the foundation of their analysis is often implicitly built upon this worldview in the first place. The result is that their concrete pathways tend to be centred around some kind of misaligned AGI, just in much more detail. Conversely, someone with a very different worldview who does such an analysis might end up with concrete pathways centred around severe discrimination against marginalized groups.
I have also noticed a split between the "superintelligence will kill us all" worldview (which you seem to be describing) and "regardless of whether superintelligence kills us all, AGI/TAI will be very disruptive and we need to manage those risks" (which seemed to be more along the lines of the Will MacAskill post you linked to - especially as he talks about directing people to causes other than technical safety or safety governance).
There are indeed many different "sub-worldviews", and I was kind of lumping them all under one big umbrella. To me, the most defining characteristic of this worldview is AI-centrism, and treating the impending AGI as an extremely big deal: not just like any other big deal we have seen before, but something unprecedented. Those within this overarching worldview would differ in terms of the details, e.g. will it kill everyone, or will it just lead to gradual disempowerment? Are LLMs getting us to AGI, or is it some yet-to-be-discovered architecture? Should we focus on getting to AGI safely, or start thinking more about the post-AGI world? I think many people move between these "sub-worldviews" as they see evidence that updates their priors, but way fewer people move out of this overarching worldview entirely.
(semi-commitment for accountability)
I'm considering writing more about how a big part of AI safety seems to be implicitly built upon an underlying worldview that we have rarely challenged.
Yeah, 1 and 3 seem right to me, thanks.
On 2, I think there are quite a number of techniques that give you quantitative risk estimates, and it's quite routine in safety engineering and often required (e.g. to demonstrate that you have achieved a 1e-4 fatality risk threshold and that any further risk reduction is impractical). I don't fully understand most of the techniques listed in ISO 31010, but it seems that a number of them do give quantitative risk estimates as a result of the risk evaluation process, e.g. Monte Carlo simulation, Bayesian networks, F/N diagrams, VaR, toxicological risk assessment, etc.
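As a rough illustration of how a Monte Carlo estimate can be compared against a 1e-4 per year criterion, here's a minimal sketch; all distributions and parameters are invented for illustration, not taken from any real QRA:

```python
# Rough Monte Carlo sketch: propagate uncertainty in an event frequency and a
# conditional fatality probability, then compare the resulting individual risk
# against a 1e-4 per year criterion. All inputs are hypothetical; real QRA
# models are far more detailed.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical uncertain inputs
event_frequency = rng.lognormal(mean=np.log(1e-4), sigma=0.5, size=n)  # events/year
p_fatality_given_event = rng.beta(2, 18, size=n)                       # ~0.1 on average

individual_risk = event_frequency * p_fatality_given_event  # fatalities/year

print(f"Mean risk:       {individual_risk.mean():.2e} per year")
print(f"95th percentile: {np.percentile(individual_risk, 95):.2e} per year")
print(f"P(risk > 1e-4):  {(individual_risk > 1e-4).mean():.2%}")
```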
If you haven't already seen this paper on risk modelling, they use FTA and Bayesian networks to estimate risks quantitatively.
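For readers unfamiliar with FTA, here's a minimal quantification sketch in that spirit; the tree structure and probabilities below are hypothetical and not taken from the paper:

```python
# Minimal fault tree quantification sketch: combine basic-event probabilities
# through OR/AND gates to get a top event probability. The tree and numbers
# are invented for illustration.

def gate_or(*probs: float) -> float:
    """P(at least one input event occurs), assuming independence."""
    p_none = 1.0
    for p in probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

def gate_and(*probs: float) -> float:
    """P(all input events occur), assuming independence."""
    p_all = 1.0
    for p in probs:
        p_all *= p
    return p_all

# Hypothetical basic events
p_sensor_failure = 1e-2
p_operator_error = 5e-2
p_backup_failure = 1e-3

# Top event: detection fails (sensor failure OR operator error) AND the backup also fails
p_detection_fails = gate_or(p_sensor_failure, p_operator_error)
p_top_event = gate_and(p_detection_fails, p_backup_failure)

print(f"P(detection fails) = {p_detection_fails:.3e}")
print(f"P(top event)       = {p_top_event:.3e}")
```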