Hello Effective Altruism Forum, I am Nate Soares, and I will be here to answer your questions tomorrow, Thursday the 11th of June, 15:00-18:00 US Pacific time. You can post questions here in the interim.
On Monday of last week, I took the reins as executive director of the Machine Intelligence Research Institute. MIRI focuses on studying technical problems of long-term AI safety. I'm happy to chat about what that means, why it's important, why we think we can make a difference now, what the open technical problems are, how we approach them, and some of my plans for the future.
I'm also happy to answer questions about my personal history and how I got here, or about personal growth and mindhacking (a subject I touch upon frequently in my blog, Minding Our Way), or about whatever else piques your curiosity. This is an AMA, after all!
EDIT (15:00): All right, I'm here. Dang there are a lot of questions! Let's get this started :-)
EDIT (18:00): Ok, that's a wrap. Thanks, everyone! Those were great questions.
So as I understand it, what MIRI is doing now is to think about theoretical issues and strategies and write papers about this, in the hope that the theory you develop can be made use of by others?
Does MIRI think of ever:
Also (feel free to skip this part of the question if it is too big/demanding):
Personally, I have a goal of advancing the field of computer-assisted proofs by making them more automated and by making the process of constructing them more user-friendly. The system would be made available through a website where people can construct and view proofs, but the components of the system would also be made available for use elsewhere. One of the goals would be to make it possible and practical to construct claims that are written in natural language from natural-language components, but that also have an unambiguous logical notation (probably in Martin-Löf type theory). The hope would be that this could be used for rigorous proofs about self-improving AI, and that the technologies/code base developed and the vocabulary/definitions/claims/proofs in the system could be of use for a goal-alignment/safety framework.
(Anyone reading this who is interested in hearing more can get in touch with me and/or take a look at this document:
https://docs.google.com/document/d/1GTTFO7RgEAJxy8HRUprCIKZYpmF4KJiVAGRHXF_Sa70/edit)
If I got across what it is that I'm hoping to make, does it sound like this could be useful to the field of AI safety / goal alignment? Or are you unsure? Or does it seem like my understanding of what the field needs is flawed to some degree, and that my efforts would in all probability be better spent elsewhere?
Kinda. The current approach is more like "Pretend you're trying to solve a much easier version of the problem, e.g. where you have a ton of computing power and you're trying to maximize diamond instead of hard-to-describe values. What parts of the problem would you still not know how to solve? Try to figure out how to solve those first."
(1) If we manage to (a) generate a theory of advanced agents under many simplifying assumptions, and then (b) generate a theory of bounded rational agents under far fewer simplifying assumptions, and then (c) figu...