636 karmaJoined


Speaking broadly, I think people underestimate the tractability of this class of work, since we’re already doing this sort of inquiry under different labels. E.g.,

  1. Nick Bostrom coined, and Roman Yampolskiy has followed up on, the Simulation Hypothesis, which is ultimately a Deist frame;
  2. I and others have written various inquiries about the neuroscience of Buddhist states (“neuroscience of enlightenment” type work);
  3. Robin Hanson has coined and offered various arguments around the Great Filter.

In large part, I don’t think these have been supported as longtermist projects, but it seems likely to me that there‘s value in pulling these strings, and each is at least directly adjacent to theological inquiry.

Great post, and agreed about the dynamics involved. I worry the current EA synthesis has difficulty addressing this class of criticism (power corrupts; transactional donations, geeks/mops/sociopaths), but perhaps we haven’t seen EA’s final form.

As a small comment, I believe discussions of consciousness and moral value tend to downplay the possibility that most consciousness may arise outside of what we consider the biological ecosystem.

It feels a bit silly to ask “what does it feel like to be a black hole, or a quasar, or the Big Bang,” but I believe a proper theory of consciousness should have answers to these questions.

We don’t have that proper theory. But I think we can all agree that these megaphenomena involve a great deal of matter/negentropy and plausibly some interesting self-organized microstructure- though that’s purely conjecture. If we’re charting out EV, let’s keep the truly big numbers in mind (even if we don’t know how to count them yet).

Thank you for this list. 

#2:  I left a comment on Matthew’s post that I feel is relevant: https://forum.effectivealtruism.org/posts/CRvFvCgujumygKeDB/my-current-thoughts-on-the-risks-from-seti?commentId=KRqhzrR3o3bSmhM7c

#16: I gave a talk for Mathematical Consciousness Science in 2020 that covers some relevant items: I’d especially point to 7,8,9,10 in my list here: https://opentheory.net/2022/04/it-from-bit-revisited/

#18+#20: I feel these are ultimately questions for neuroscience, not psychology. We may need a new sort of neuroscience to address them. (What would that look like?)

I posted this as a comment to Robin Hanson’s “Seeing ANYTHING Other Than Huge-Civ Is Bad News” —


I feel these debates are too agnostic about the likely telos of aliens (whether grabby or not). Being able to make reasonable conjectures here will greatly improve our a priori expectations and our interpretation of available cosmological evidence.

Premise 1: Eventually, civilizations progress until they can engage in megascale engineering: Dyson spheres, etc.

Premise 2: Consciousness is the home of value: Disneyland with no children is valueless.

Premise 2.1: Over the long term we should expect at least some civilizations to fall into the attractor of treating consciousness as their intrinsic optimization target.

Premise 3: There will be convergence that some qualia are intrinsically valuable, and what sorts of qualia are such.

Conjecture: A key piece of evidence for discerning the presence of advanced alien civilizations will be megascale objects which optimize for the production of intrinsically valuable qualia.

Speculatively, I suspect black holes and pulsars might fit this description.






Reasonable people can definitely disagree here, and these premises may not work for various reasons. But I’d circle back to the first line: I feel these debates are too agnostic about the likely telos of aliens (whether grabby or not). In this sense I think we’re leaving value on the table.

Great, thank you for the response.

On (3) — I feel AI safety as it’s pursued today is a bit disconnected from other fields such as neuroscience, embodiment, and phenomenology. I.e. the terms used in AI safety don’t try to connect to the semantic webs of affective neuroscience, embodied existence, or qualia. I tend to take this as a warning sign: all disciplines ultimately refer to different aspects of the same reality, and all conversations about reality should ultimately connect. If they aren’t connecting, we should look for a synthesis such that they do.

That’s a little abstract; a concrete example would be the paper “Dissecting components of reward: ‘liking’, ‘wanting’, and learning” (Berridge et al. 2009), which describes experimental methods and results showing that ‘liking’, ‘wanting’, and ‘learning’ can be partially isolated from each other and triggered separately. I.e. a set of fairly rigorous studies on mice demonstrating they can like without wanting, want without liking, etc. This and related results from affective neuroscience would seem to challenge some preference-based frames within AI alignment, but it feels there‘s no ‘place to put’ this knowledge within the field. Affective neuroscience can discover things, but there’s no mechanism by which these discoveries will update AI alignment ontologies.

It’s a little hard to find the words to describe why this is a problem; perhaps that not being richly connected to other fields runs the risk of ‘ghettoizing‘ results, as many social sciences have ‘ghettoized’ themselves.

One of the reasons I’ve been excited to see your trajectory is that I’ve gotten the feeling that your work would connect more easily to other fields than the median approach in AI safety.

  1. What do you see as Aligned AI’s core output, and what is its success condition? What do you see the payoff curve being — i.e. if you solve 10% of the problem, do you get [0%|10%|20%] of the reward?
  2. I think a fresh AI safety approach may (or should) lead to fresh reframes on what AI safety is. Would your work introduce a new definition for AI safety?
  3. Value extrapolation may be intended as a technical term, but intuitively these words also seem inextricably tied to both neuroscience and phenomenology. How do you plan on interfacing with these fields? What key topics of confusion within neuroscience and phenomenology are preventing interfacing with these fields?
  4. I was very impressed by the nuance in your “model fragments” frame, as discussed at some past EAG. As best as I can recall, the frame was: that observed preferences allow us to infer interesting things about the internal models that tacitly generate these preferences, that we have multiple overlapping (and sometimes conflicting) internal models, and that it is these models that AI safety should aim to align with, not preferences per se. Is this summary fair, and does this reflect a core part of Aligned AI’s approach?

Finally, thank you for taking this risk.

I consistently enjoy your posts, thank you for the time and energy you invest.

Robin Hanson is famous for critiques in the form of “X isn’t about X, it’s about Y.” I suspect many of your examples may fit this pattern. To wit, Kwame Appiah wrote that “in life, the challenge is not so much to figure out how best to play the game; the challenge is to figure out what game you’re playing.” Andrew Carnegie, for instance, may have been trying to maximize status, among his peers or his inner mental parliament. Elon Musk may be playing a complicated game with SpaceX and his other companies. To critique assumes we know the game, but I suspect we only have a dim understanding of ”the great game” as it’s being played today.

When we see apparent dysfunction, I tend to believe there is dysfunction, but more deeper in the organizational-civilizational stack than it may appear. I.e. I think both Carnegie and Musk were/are hyper-rational actors responding to a very complicated incentive landscape.

That said, I do think ideas get lodged in peoples’ heads, and people just don’t look. Fully agree with your general suggestion, “before you commit yourself to a lifetime’s toil toward this goal, spend a little time thinking about the goal.”

That said— I’m also loathe to critique doers too harshly, especially across illegible domains like human motivation. I could see how more cold-eyed analysis could lead to wiser aim in what things to build; I could also see it leading to fewer great things being built. I can’t say I see the full tradeoffs at this point.

Most likely infectious diseases also play a significant role in aging- have seen some research suggesting that major health inflection points are often associated with an infection.

I like your post and strongly agree with the gist.

DM me if you’re interested in brainstorming alternatives to the vaccine paradigm (which seems to work much better for certain diseases than others).

Generally speaking, I agree with the aphorism “You catch more flies with honey than vinegar;”

For what it’s worth, I interpreted Gregory’s critique as an attempt to blow up the conversation and steer away from the object level, which felt odd. I’m happiest speaking of my research, and fielding specific questions about claims.

Load more