Comments (4)

I think an extremely dangerous failure mode for AI safety research would be to prioritize 'hardcore-quantitative people' trying to solve AI alignment using clever technical tricks, without understanding much about the 8 billion ordinary humans they're trying to align AIs with. 

If behavioral scientists aren't involved in AI alignment research -- including as skeptics about whether 'alignment' is even possible -- it's quite likely that whatever 'alignment solution' the hardcore-quantitative people invent will be brittle, inadequate, and risky.

Trevor -- yep, reasonable points. Alignment might be impossible, but it might be extra impossible with complex black-box networks with hundreds of billions of parameters and very poor interpretability tools.
