Max Clarke

297 karmaJoined Northland, Wellington, New Zealand

Comments
87

By the way, this is a very good poll! 

I think it's neither necessary nor sufficient for robust alignment. I'm uncertain as to whether it's possible to get some kind of "fragile" alignment from pretraining. I don't think robust alignment requires it, but neither do I think that it doesn't. It definitely doesn't hurt.

Yes, I agree with that statement. However, my answer is related to "stability under reflection": specifically, I think you're either in or out of an alignment basin (or such a basin might not be possible). If you're in it, it's not correct to say "partially aligned"; what you've got is something that's aligned. And if you're out of it (or there's no such thing), then what you've got is not aligned. Partial alignment, to me, means preserving only some values under repeated reflection, which I think is plausibly possible but exponentially unlikely (I'd pick a 99.999% disagree option if it were there, basically).

Max Clarke
1
0
0
100% agree

Multipolar worlds will compete away >90% of net value that would otherwise be preserved

Strongly agree

Max Clarke
2
0
0
70% agree

Alignment to specific values is underrated in research relative to control

Yes, I think control is a waste of time. We need actual alignment to actual (universalized) values.

Max Clarke
1
0
0
100% disagree

Partially aligned transformative AIs are likely to be stable under reflection

I disagree that "partially aligned" is a statement that has meaning here.

Max Clarke
2
0
0
0% agree

Research into digital mind suffering is sufficiently tractable to work on

I don't know.

Max Clarke
4
1
0
70% agree

AI alignment to humans will in practice avoid moral catastrophes to animals

Alignment requires a mechanical understanding of good and bad, and it will be clear how to apply it to animals. Note that wild animal suffering arguments imply that the status quo is likely a moral catastrophe. I believe an aligned entity or system would attempt to change that.

Max Clarke
2
0
0
70% agree

AI alignment to humans will in practice avoid moral catastrophes to digital minds

Likewise, alignment requires a mechanical understanding of good and bad, and it will be clear how to apply it to digital minds.

Max Clarke
1
0
0
0% agree

Robust alignment requires alignment-relevant intervention during pretraining

Frankly, I neither agree nor disagree with this statement. Robust alignment has nothing to do with the current pretraining regime. It should work with or without it.