I’ve written a draft report evaluating a version of the overall case for existential risk from misaligned AI, and taking an initial stab at quantifying the risk from this version of the threat. I’ve made the draft viewable as a public Google Doc here (Edit: arXiv version here, video presentation here, human-narrated audio version here). Feedback would be welcome.
This work is part of Open Philanthropy’s “Worldview Investigations” project. However, the draft reflects my personal (rough, unstable) views, not the “institutional views” of Open Philanthropy.
Hey Joe!
Great report, really fascinating stuff. It draws together lots of different writing on the subject, and I really like how you identify concerns that speak to different perspectives (e.g. both Drexler's CAIS and classic Bostrom-style superintelligence).
Three quick bits of feedback:
Which is, unfortunately, a pretty key premise, and the one I have the most questions about! My impression is that section 6.3 is where that argumentation is intended to occur, but I didn't come away with a sense of how you think this would scale up, disempower everyone, and be permanent. Would love for you to say more on this.
Presumably we should also be worried about a small group doing this? For example, consider a scenario in which a power-hungry small group, or several competing groups, use aligned AI systems with advanced capabilities (perhaps APS, perhaps not) to permanently disempower ~all of humanity.
If I went through and find-replaced every "PS-misaligned AI system" with "power-hungry small group", would it read that differently? To borrow Tegmark's terms, does it matter whether it's the Omega Team or Prometheus?
I'd be interested in seeing more from you about whether you're also concerned about that scenario, whether you're more or less concerned, and how you think it's different from the AI system scenario.
Again, really loved the report, it is truly excellent work.
Hi Hadyn,
Thanks for your kind words, and for reading.
- Thanks for pointing out these pieces. I like the breakdown of the different dimensions of long-term vs. near-term.
- Broadly, I agree with you that the document could benefit from more about premise 5. I’ll consider revising to add some.
- I’m definitely concerned about misuse scenarios too (and I think the lines here can get blurry -- see e.g. Katja Grace’s recent post); but I wanted, in this document, to focus on misalignment in particular. The question of how to weigh misuse vs. misalignment risk…