I agree there are some possible attitudes society could take towards AI development that could put us in a much safer position.
I think the degree of consensus you'd need for the position you're outlining here is practically infeasible to reach, absent some big shift in the basic dynamics. I think the possible shifts which might get you there are roughly:
I think there's potentially something to each of these. But I think the GDM paper is (in expectation) actively helpful for 1 and probably 3, and doesn't move the needle much either way on 2.
(My own view is that 3 is the most likely route to succeed. There's some discussion of the pragmatics of this route in AI Tools for Existential Security and AI for AI Safety (both of which also discuss automation of safety research, which is another potential success route), and relevant background views on the big-picture strategic situation in the Choice Transition. But I also feel positive about people exploring routes 1 and 2.)
I agree that there could be an effect that keeps people from speaking out about AI danger. But:
I downvoted this (but have upvoted some of your comments).
I think this advice is at minimum overstated, and likely wrong and harmful (at least if taken literally). And it's presented with rhetorical force, so that it seems to be mostly pushing people's views towards a position that is (IMO) harmful, rather than providing them with information to help them come to their own conclusions.
TBC:
Which applications to focus on: I agree that epistemic tools and coordination-enabling tools will eventually have markets and so will get built at some point absent intervention. But this doesn't feel like a very strong argument -- the whole point is that we may care about accelerating these applications even if the acceleration isn't by a long period. And I don't think that these will obviously be among the most profitable applications people could make (especially if you can start specializing in the most high-leverage epistemic and coordination tools).
Also, we could make a similar argument that "automated safety" research won't get dropped, since it's so obviously in the interests of whoever's winning the race.
UI and complementary technologies: I'm sort of confused about your claim about comparative advantage. Are you saying that there aren't people in this community whose comparative advantage might be designing UI? That would seem surprising.
More broadly, though:
Compute allocation: mostly I think that "get people to care more" does count as the type of thing we were talking about. But I think it's not just about caring about safety, but also about being aware ahead of time of the role that automated research may have to play in this, and when it may be appropriate to hit the gas and allocate a lot of compute to particular areas.
Training data: I agree that the stuff you're pointing to seems worthwhile. But I feel like you've latched onto a particular type of training data, and you're missing important categories, e.g.:
It seems like "what can we actually do to make the future better (if we have a future)?" is a question that keeps on coming up for people in the debate week.
I've thought about some things related to this, and it seemed worth pulling some of those threads together (with apologies for leaving it kind of abstract). Roughly speaking, I think that:
There are some other activities which might help make the future better without doing so much to increase the chance of having a future, e.g.:
However, these activities don't (to me) seem as high leverage for improving the future as the more mixed-purpose activities.
Ughh ... baking judgements about what's morally valuable into the question somehow doesn't seem ideal. Like I think it's an OK way to go for moral ~realists, but among anti-realists you might have people persistently disagreeing about what counts as extinction.
Also: what if you have a world which is like the one you describe as an extinction scenario, but there's a small amount of moral value in some subcomponent of that AI system? Does that mean it no longer counts as an extinction scenario?
I'd kind of suggest instead using the typology Will proposed here, and making the debate between (1) + (4) on the one hand vs (2) + (3) on the other.
These are in the same category because:
I'm not actually making a claim about alignment difficulty -- beyond that I do think systems in the vein of those around today, and their near-successors, look pretty safe.
I think that getting people to pause AI research would be a bigger lift than any nonproliferation treaty we've had in the past (not that such treaties have always been effective!). This isn't just military tech; it's a massively valuable economic tech. Given the incentives, and the importance of having treaties actually followed, I do think this would be a more difficult challenge than any past nonproliferation work. I don't think that means it's impossible, but I do think it's way more likely if something shifts -- hence my routes 1-3.
(Or, if you were asking why I say "out of reach now" in the quoted sentence, it's because I'm literally talking about "much better coordination" as a capability; not about what could or couldn't be achieved with a certain level of coordination.)