4433 karma


EA advertisements
Courting Virgo
EA Gather Town
Improving EA tech work


Topic contributions

Off the top of my head:

* Any moral philosophy arguments that imply the long term future dwarfs the present in significance (though fwiw I agree with these arguments - everything below I have at least mixed feelings on)

* The classic arguments for 'orthogonality' of intelligence and goals (which I criticised here)

* The classic arguments for instrumental convergence towards certain goals

* Claims about the practical capabilities of future AI agents 

* Many claims about the capabilities of current AI agents, e.g. those comparing them to intelligent high schoolers/university students (when you can quickly see trivial ways in which they're nowhere near the reflexivity of an average toddler)

* Claims that working on longtermist-focused research is likely to be better for the long term than working on nearer term problems

* Claims that, within longtermist-focused research, focusing on existential risks (in the original sense, not the very vague 'loss of potential' sense) is better than working on ways to make the long term better conditional on it existing (or perhaps looking for ways to do both)

* Metaclaims about who should be doing such research, e.g. on the basis that they published other qualitative arguments that we agree with

* Almost everything on the list linked in the above bullet

Do you mean examples of such conversations, of qualitative arguments for those positions, or of danger to epistemics from the combination?

In-group “consensus” could just be selection effects

I worry that this is an incredibly important factor in much EA longtermist/AI safety/X-risk thinking. In conversation, EAs don't seem to appreciate how qualitative the arguments for these concerns are (even if you assume a totalising population axiology) and how dangerous to epistemics that is in conjunction with selection effects.

Thanks! I wrote a first draft a few years ago, but I wanted an approach that leaned on intuition as little as possible if at all, and ended up thinking my original idea was untenable. I do have some plans on how to revisit it and would love to do so once I have the bandwidth.

I remember finding getting started really irritating. I just gave you the max karma I can on your two comments. Hopefully that will get you across the threshold.

Sorry, I should say either that or imply some at-least-equally-dramatic outcome (e.g. favouring immediate human extinction in the case of most person-affecting views). Though I also think there are convincing interpretations of such views in which they still favour some sort of shockwave, since they would seek to minimise future suffering throughout the universe, not just on this planet.

more on such appearances here

I'll check this out if I ever get around to finishing my essay :) Off the cuff though, I remain immensely sceptical that one could usefully describe 'preference as basically an appearance of something mattering, being bad, good, better or worse' in such a way that such preferences could be

a. detachable from consciousness, and

b. unambiguous in principle, and

c. grounded in any principle that is universally motivating to sentient life (which I think is the big strength of valence-based theories)

Notice that Shulman does not say anything about AI consciousness or sentience in making this case. Here and throughout the interview, Shulman de-emphasizes the question of whether AI systems are conscious, in favor of the question of whether they have desires, preferences, interests. 

I'm a huge fan of Shulman in general, but on this point I find him quasi-religious. He once sincerely described hedonistic utilitarianism as 'a doctrine of annihilation' on the grounds (I assume) that it might advocate tiling the universe with hedonium - ignoring that preference-based theories of value either reach the same conclusions or have a psychopathic disregard for the conscious states sentient entities do have. I've written more about why here

So “existential catastrophe” probably shouldn't just mean "human extinction". But then it is surprisingly slippery as a concept. Existential risk is the risk of existential catastrophe, but it's difficult to give a neat and intuitive definition of “existential catastrophe” such that “minimise existential catastrophe” is a very strong guide for how to do good. Hilary Greaves discusses candidate definitions here.


Tooting my own trumpet, I did a lot of work on improving the question x-riskers are asking in this sequence.

METR is hiring ML engineers and researchers to drive these AI R&D evaluations forward.


These links both say the respective role is now closed.
