[[THIRD EDIT: Thanks so much for all of the questions and comments! There are still a few more I'd like to respond to, so I may circle back to them a bit later, but, due to time constraints, I'm otherwise finished up for now. Any further comments or replies to anything I've written are also still appreciated!]]
Hi!
I'm Ben Garfinkel, a researcher at the Future of Humanity Institute. I've worked on a mixture of topics in AI governance and in the somewhat nebulous area FHI calls "macrostrategy", including: the long-termist case for prioritizing work on AI, plausible near-term security issues associated with AI, surveillance and privacy issues, the balance between offense and defense, and the obvious impossibility of building machines that are larger than humans.
80,000 Hours recently released a long interview I recorded with Howie Lempel about a year ago, in which we walked through various long-termist arguments for prioritizing work on AI safety and AI governance relative to other cause areas. The longest and probably most interesting stretch explains why I no longer find the central argument in Superintelligence, and in related writing, very compelling. At the same time, I do continue to regard AI safety and AI governance as high-priority research areas.
(These two slide decks, which were linked in the show notes, give more condensed versions of my views: "Potential Existential Risks from Artificial Intelligence" and "Unpacking Classic Arguments for AI Risk." This piece of draft writing instead gives a less condensed version of my views on classic "fast takeoff" arguments.)
Although I'm most interested in questions related to AI risk and cause prioritization, feel free to ask me anything. I'm likely to eventually answer most questions that people post this week, on an as-yet-unspecified schedule. You should also feel free just to use this post as a place to talk about the podcast episode: there was a thread a few days ago suggesting this might be useful.
From the podcast transcript:
I continue to have a lot of uncertainty about how likely it is that AI development will look like "there’s this separate project of trying to figure out what goals to give these AI systems" versus a development process where capability and goals are necessarily connected. (I didn't find your arguments in favor of the latter very persuasive.) For example, it seems GPT-3 can be seen as more like the former than the latter. (See this thread for background on this.)
To the extent that AI development is more like the latter than the former, that might be bad news for (a certain version of) the orthogonality thesis, but it could be even worse news for the prospect of AI alignment: instead of disaster striking only if we can't figure out the right goals to give the AI, it could also be that we know what goals we want to give it but, due to constraints of the development process, we can't give it those goals and can only build AI with unaligned goals. So it seems to me that the latter scenario can also rightly be described as an "exogenous deadline of the creep of AI capability progress". (In both cases, we can try to refrain from developing or deploying AGI, but it may be a difficult coordination problem for humanity to stay in a state where we know how to build AGI but choose not to, and in any case this consideration cuts equally across both scenarios.)