Got sent a set of questions from ARBOx to handle async; thought I'd post my answers publicly:
lfg!
Following on from this post:
A few more things I often say that obliquely relate to networking:
Great post! Was just thinking about an intuition pump of my own re: EV earlier today, and it has a similar backdrop, of vaccine development. Also, you gave me a line with which to lead into it:
The work I do doesn't end up helping other researchers get closer to coming up with a cure.
Oh, but it could have helped! It probably does, though there are exceptions, e.g. if your work is so heavily misguided that nobody else would have worked on it, or if it is gated.
By doing the work and showing it doesn't lead to a cure, you're freeing someone else who would have done that work to do some other work instead. Assuming they would still be searching for a cure, you've increased the probability that the remaining researchers do in fact find a cure.
I encounter "in 99.9% of worlds, I end up making no progress" a lot in my work. In its place, I offer that it is important and valuable to chase down many different bets to their conclusions. The vaccine is not developed by a single party in isolation from all the knowledge being generated around them, but through the collected efforts of thousands of failed attempts from as many groups. The victor can claim only the lion's share of the credit, not all of it; every (plausible) failed attempt gets some part of the value generated by the endeavour as a whole, even ex post.
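To put rough numbers on the "a published dead end still helps" intuition, here is a toy sketch. Everything in it is an assumption made for illustration: exactly one of `N_CANDIDATES` approaches works, each is equally likely to be the one, and a follow-up team can afford to test `N_TESTED` of them. None of these figures come from real vaccine research; `N_CANDIDATES = 1000` is chosen only so that a single bet pays off in 0.1% of worlds.

```python
from fractions import Fraction

def p_success(n_candidates: int, n_tested: int) -> Fraction:
    """Chance that testing n_tested distinct candidates includes the one that works.

    Toy assumption: exactly one of n_candidates works, all equally likely.
    """
    if not 0 <= n_tested <= n_candidates:
        raise ValueError("n_tested must be between 0 and n_candidates")
    return Fraction(n_tested, n_candidates)

# Hypothetical numbers, chosen only to mirror the "99.9% of worlds" framing.
N_CANDIDATES = 1000  # plausible approaches to a cure
N_TESTED = 50        # approaches the remaining researchers can afford to try

# My single bet fails in 999 out of 1000 worlds.
my_bet = p_success(N_CANDIDATES, 1)

# If my negative result stays unknown, others still search all 1000 candidates.
before = p_success(N_CANDIDATES, N_TESTED)

# If my dead end is published, others only need to search the remaining 999.
after = p_success(N_CANDIDATES - 1, N_TESTED)

print(f"my bet succeeds:       {float(my_bet):.4%}")
print(f"others, without my result: {float(before):.4%}")
print(f"others, with my result:    {float(after):.4%}")
print(f"uplift from one failure:   {after - before}")
```

The uplift from any one failed attempt is tiny under these assumptions, but it is strictly positive, and it adds up across the thousands of failed attempts that narrow the search for whoever eventually succeeds.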
"anyone" is a high bar! Maybe worth looking at what notable orgs might want to fund, as a way of spotting "useful safety work not covered by enough people"?
I notice you're already thinking about this in some useful ways, nice. I'd love to see a clean picture of threat models overlaid with plans/orgs that aim to address them.
I think the field is changing too fast for any specific claim here to stay true in 6-12 months.
Signal boost: Check out the "Stars" and "Follows" on my GitHub account for ideas of where to get stuck into AI safety.
A lot of people want to understand AI safety by playing around with code and closing some issues, but don't know where to find such projects. So I've recently started scanning GitHub for AI-safety-relevant projects and repositories. I've starred some, and followed some orgs/coders there as well, to make it easy for you to find these and get involved.
Excited to get more suggestions too! Feel free to comment here, or send them to me at sk@80000hours.org.
Thanks. I sort of don't buy that that's what the Mechanize piece says, and in any case "no matter what you do" sounds a bit fatalistic, similar to death. Sure, we all die, but does that really mean we shouldn't try and live healthier for longer?
Not directly relating to your claim, but:
The Mechanize piece claims "Full automation is desirable", which I don't agree with, both a priori and after reading their substantiation. It does not contend with the possibility of catastrophic risks from fully automating, say, bioweapon research and development. Full automation might be inevitable, but on desirability I think it's clear that it's only desirable once, at the bare minimum, substantial risks have been planned for and/or suitably mitigated. It's totally reasonable to delay the inevitable!
Thanks Matt. Good read.
A stronger technological determinism tempers this optimism by saying that the kinds of minds you get will be whichever are easiest to build or maintain, and that those quite-specific minds will dominate no matter what you do.
Is there a thing you would point to that substantiates or richly argues for this claim? It seems non-obvious to me.
I try to maintain this public doc of AI safety cheap tests and resources, although it's due a deep overhaul.
Suggestions and feedback welcome!
Weekly Prompts
Recently, an advisee told me that they've been procrastinating on replying to my email. It sits at the top of their stack each week. When they try to reply, they instead act on the prompts within, and so no longer need to correspond with me for the time being.
They run this in a loop, and keep moving forward.
My email:
Consider setting up such prompts for your own weekly check-ins. Let me know some of your most effective prompts in the comments!