Stuart Armstrong


I argue that it's entirely the truth, the way that the term is used and understood.

Precisely. And supporting subsidized contraception is a long way away from both the formal definition of eugenics and its common understanding.

I feel that saying "subsidized contraception is not eugenics" is rhetorically better and more accurate than this approach.

Ah, you made the same point I did, but better :-)

>Most people endorse some form of 'eugenics'

No, they don't. It is akin to saying "most people endorse some form of 'communism'." We can point to a lot of overlap between theoretical communism and values that most people endorse; this doesn't mean that people endorse communism. That's because communism covers a lot more stuff, including a lot of historical examples and some related atrocities. Eugenics similarly covers a lot of historical examples, including some atrocities (not only in fascist countries), and this is what the term means to most people - and hence, in practice, what the term means.

Many people endorse screening embryos for genetic abnormalities. The same people would respond angrily if you said they endorsed eugenics, in the same way that people who endorse minimum wages would respond angrily if you said they endorsed communism. Eugenics is evil because, as the word is actually used, it describes something evil; trying to force it into some other technical meaning is incorrect.

Thanks, that makes sense.

I've been aware of those kinds of issues; what I'm hoping is that we can get a framework to include these subtleties automatically (eg by having the AI learn them from observations or from human-published papers) without having to put it all in by hand ourselves.

Hey there! It is a risk, but the reward is great :-)

  1. Value extrapolation makes most other AI safety approaches easier (eg interpretability, distillation and amplification, low impact...). Many of these methods also make value extrapolation easier (eg interpretability, logical uncertainty,...). So I'd say the contribution is superlinear - solving 10% of AI safety our way will give us more than 10% progress.
  2. I think it has already reframed AI safety from "align AI to the actual (but idealised) human values" to "have an AI construct values that are reasonable extensions of human values".
  3. Can you be more specific here, with examples from those fields?
  4. I see value extrapolation as including almost all my previous ideas - it would be much easier to incorporate model fragments into our value function, if we have decent value extrapolation.

An AI that is aware that value is fragile will behave in a much more cautious way. This gives a different dynamic to the extrapolation process.

  1. Nothing much to add to the other post.
  2. Imagine that you try to explain to a potential superintelligence that we want it to preserve a world with happy people in it by showing it videos of happy people. It might conclude that it should make people happy. Or it might conclude that we want more videos of happy people. The latter is more compatible with the training that we have given it. The AI will be safer if it hypothesizes that we may have meant the former, despite having given it evidence more compatible with the latter, and pursues both goals rather than merely the latter. This is what we are working towards.   
  3. Value alignment. Good communication and collaboration skills. Machine learning skills. Smart, reliable, and creative. Good at research. At present we are looking for a Principal ML Engineer and other senior roles.
  4. The ability to move quickly from theory to model to testing the model and back.
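
The happy-people example in point 2 above can be made concrete with a toy decision problem. This is only an illustrative sketch, not the actual value extrapolation method: the hypothesis names, the actions, the credences, and the payoff numbers are all made up, and "pursue both goals" is modelled here, as an assumption, by choosing the action with the best worst case across every plausible hypothesis.

```python
# Toy illustration only: hypotheses, actions, and numbers are invented.

# Credence the AI assigns to each reading of what the humans meant.
# The videos reading fits the training data better, so it gets more weight.
credence = {"more_happy_videos": 0.7, "more_happy_people": 0.3}

# Hypothetical payoff (0 to 1) of each action under each hypothesis.
payoff = {
    "film_actors_smiling": {"more_happy_videos": 1.0, "more_happy_people": 0.0},
    "improve_lives": {"more_happy_videos": 0.1, "more_happy_people": 1.0},
    "improve_lives_and_film_it": {"more_happy_videos": 0.8, "more_happy_people": 0.9},
}


def best_for_top_hypothesis(credence, payoff):
    """Commit to the single most likely hypothesis and optimise for it."""
    top = max(credence, key=credence.get)
    return max(payoff, key=lambda action: payoff[action][top])


def best_for_all_plausible(credence, payoff, min_credence=0.1):
    """Keep every hypothesis above a credence threshold and pick the
    action whose worst-case payoff across them is highest."""
    plausible = [h for h, p in credence.items() if p >= min_credence]
    return max(payoff, key=lambda action: min(payoff[action][h] for h in plausible))


print(best_for_top_hypothesis(credence, payoff))
print(best_for_all_plausible(credence, payoff))
```

With these made-up numbers, the first function picks the action that only produces videos, while the second picks the action that does reasonably well under both readings. A weighted average over hypotheses would be another reasonable way to model the same idea.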