There seem to be two main framings emerging from recent AGI x-risk discussion: "default doom, given AGI" and "default we're fine, given AGI".
I'm interested in what people who have low p(doom|AGI) think are the reasons that things will basically be fine once we have AGI (or TAI, PASTA, ASI). What mechanisms are at play? How is alignment solved so that there are zero failure modes? Can we survive despite imperfect alignment? How? Is alignment moot? Will physical limits be reached before there is too much danger?
If you have high enough p(doom|AGI) to be very concerned, but you're still only at ~1-10%, what is happening in the other 90-99%?
Added 22 Apr: I'm also interested in detailed scenarios and stories spelling out how things go right post-AGI. There are plenty of stories and scenarios illustrating doom; where are the similar stories illustrating how things go right? There is the FLI World Building Contest, but that took place in the pre-GPT-4+AutoGPT era. The winning entry has everyone acting far too sensibly in terms of self-regulation and restraint. Given the fervour over AutoGPT, I think we can now say, with high likelihood, that this will not happen.
A key part of my model right now relies on who develops the first AGI and on how many AGIs are developed.
If the first AGI is developed by OpenAI, Google DeepMind or Anthropic - all of whom seem relatively cautious (perhaps some more than others) - I put the chance of massively catastrophic misalignment at <20%.
If one of those labs is first and somehow able to prevent other actors from creating AGI after this, then that leaves my overall massively catastrophic misalignment risk at <20%. However, while I think it's likely one of these labs would be first, I'm highly uncertain about whether they would achieve the pivotal outcome of preventing subsequent AGIs.
So, if some less cautious actor overtakes the leading labs, or if the leading lab that first develops AGI cannot prevent many others from building AGI afterward, I think there's a much higher likelihood of massively catastrophic misalignment from one of these attempts to build AGI. In this scenario, my overall massively catastrophic misalignment risk is definitely >50%, and perhaps closer to the 75-90% range.
This just seems like a hell of a reckless gamble to me. And you have to factor in their massive profit-making motivation. Is this really much more than mere safetywashing?
Or, y'know, you could just not build...