The more I think about "counterfactual robustness", the more I think consciousness is absurd.
Counterfactual robustness implies, for animal (including human) brains, that even if the presence of a neuron (or specific connection) during a given sequence of brain activity didn't affect that sequence (i.e. the neuron didn't fire, and had it disappeared, the activity would have been the same), its presence can still matter for what exactly was experienced, and whether anything was experienced at all, because it could have made a difference in counterfactual sequences that didn't happen.

That seems unphysical: we're saying that even though something made no actual physical difference, it can still make a difference to subjective experience. And, of course, since those neurons don't affect neural activity by hypothesis, their disappearance during those sequences where they have no influence wouldn't affect reports, either! So, no physical influence and no difference in reports. How could those missing neurons possibly matter during that sequence? Einstein derided quantum entanglement as "spooky action at a distance". Counterfactual robustness seems like "spooky action from alternate histories".
To really drive the point home: if you were being tortured, would it stop feeling bad if those temporarily unused neurons (had) just disappeared, but you acted no differently? Would you be screaming and begging, but unconscious?
So, surely we must reject counterfactual robustness. Then, it seems that what we're left with is that your experiences are reducible to just the patterns of actual physical events in your brain: roughly, the neurons that actually fired and the signals actually sent between them. So, we should be some kind of identity theorist.
But neurons don't seem special, and if you reject counterfactual robustness, it's hard to see how we wouldn't find consciousness everywhere. Not only that: maybe even human-like experiences, like the feeling of being tortured, could be widespread in mundane places, like the interactions between particles in walls. That seems very weird and unsettling.
Maybe we're lucky, though, and the brain activity patterns responsible for morally relevant experiences are complex enough that they're rare in practice. That would be somewhat reassuring if it turns out to be true, but I'd also prefer that whether or not my walls are being tortured not depend too much on how many particles there are and how much they interact in basically random ways. My understanding is that the ways out (without accepting the unphysical) that don't depend on this kind of empirical luck require pretty hard physical assumptions (or, even worse, biological substrationist assumptions 🤮) that would prevent us from recognizing beings that are functionally equivalent to and overall very similar to humans as conscious, which also seems wrong. Type identity theory with detailed types (e.g. involving actual biological neurons) is one such approach.
But if we really want to be able to abstract away almost all of the physical details and just look at a causal chain of events and interactions, then we should go with a fairly abstract/substrate-independent type identity theory, or with something like token identity theory/anomalous monism, and accept the possibility of tortured walls. Or, we could accept a huge and probably ad hoc disjunction of physical details: "mental states such as pain could eventually be identified with the (potentially infinite) disjunctive physical state of, say, c-fiber excitation (in humans), d-fiber excitation (in mollusks), and e-network state (in a robot)" (Schneider, 2010?). But how could we possibly know what to include and exclude in this big disjunction?
Since it looks like every possible theory of consciousness has to either accept or reject counterfactual robustness, and there are absurd consequences either way, every theory of consciousness will have absurd consequences. So, it looks like consciousness is absurd.
What to do?
I'm still leaning towards some kind of token identity theory, since counterfactual robustness seems to just imply pretty obviously false predictions about experiences when you get rid of stuff that wouldn't have made any physical difference anyway, and substrationism will be the new speciesism, whereas tortured walls just seem very weird and unsettling. But maybe I'm just clinging to physical intuitions, and I should let them go and accept counterfactual robustness, and that getting rid of the unused neurons during torture can turn off the lights. "Spooky action at a distance" ended up being real, after all.
What do you think?
Helpful stuff I've read on this and was too lazy to cite properly
Counterfactual robustness
- "2.2 Is a Wall a Computer?" in https://www.nyu.edu/gsas/dept/philo/faculty/block/papers/msb.html
- "Objection 3" and "Counterfactuals Can't Count" in http://www.doc.gold.ac.uk/~mas02mb/Selected%20Papers/2004%20BICS.pdf
- https://reducing-suffering.org/which-computations-do-i-care-about/#Counterfactual_robustness
- "Objection 6: Mapping to reality" https://opentheory.net/2017/07/why-i-think-the-foundational-research-institute-should-rethink-its-approach/ (also on the EA Forum here)
Identity theory
This is a subject I've thought about a lot, so I'm pretty happy to have seen this post :).
I'm not convinced by counterfactual robustness either. For one, I don't think humans are very counterfactually robust, since we rely on a relatively specific environment to live. And where to draw the line between robust and non-robust seems arbitrary.
Plus, whether a person is counterfactually robust can be changed without modifying them at all, only by modifying their surroundings. For example, if you could perfectly predict a person's actions, you could "trap" their environment, adding some hidden cameras that check that the person doesn't deviate from your predictions and trigger a bomb if they do deviate. Then that person is no longer counterfactually robust, since any slight change will trigger the bomb and destroy them. But we didn't touch them at all, only some hidden surroundings!
---
I also suspect that we can't just bite the bullet about consciousness and Turing machines appearing everywhere, since I think it would have anthropic implications that don't match reality. Anthropic arguments are not on very solid footing, so I'm not totally confident about that, but nonetheless I think there's probably just something we don't understand yet.
I also think this absurdity you've noticed is an instance of a more general problem, since it applies to pretty much any emergent pattern. The same way you can find consciousness everywhere, you can find all sorts of Turing machines everywhere. So I view this as the problem of trying to characterize emergent phenomena.
---
Investigating causality was the lead I followed for a while as well, but every attempt I've made with it has ended up too strong, capable of seeing imaginary Turing machines everywhere. So lately I've been investigating the possibility that emergence might be about *information* in addition to causality.
One intuition I have for this is that the problem might happen because we add information in the process of pointing to the emergent phenomena. Given a bunch of particles randomly interacting with each other, you can probably point to a path of causality and make a correspondence to a person. But pointing out that path takes a lot of information which might only be present inside the pointer, so I think it's possible that we're effectively "sneaking in" the person via our pointer.
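To make that "sneaking in" worry concrete, here is a toy sketch (my own, purely hypothetical; nothing like it appears in the discussion above). We "find" a simple computation, a counter, in a sequence of random states just by building the right interpretation map, and all of the structure ends up living in the map (the pointer) rather than in the states themselves:

```python
import random

# Hypothetical illustration: map a sequence of random, causally unrelated
# "wall states" onto the successive states of a trivial computation (a counter).
# The same trick works for *any* sequence of distinct random states, which
# suggests the computation lives in the mapping, not in the wall.

random.seed(0)

# 10 snapshots of a "wall": each is just a tuple of random numbers.
wall_history = [tuple(random.random() for _ in range(5)) for _ in range(10)]

# The computation we claim to "find" in the wall: a counter 0, 1, 2, ...
computation_history = list(range(10))

# The interpretation map: wall snapshot at time t -> counter state at time t.
interpretation = {wall_state: count
                  for wall_state, count in zip(wall_history, computation_history)}

# "Reading off" the computation from the wall only works via this map, and
# building the map required already knowing the computation's whole trajectory.
decoded = [interpretation[state] for state in wall_history]
assert decoded == computation_history
print("Decoded computation:", decoded)
```

The map can only be written down by someone who already has the counter's trajectory in hand, which is one way of cashing out the idea that the pointer smuggles the computation in.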
I often also use Conway's Game of Life when I think about this issue. In the Game of Life, bits are often encoded as the presence or absence of a glider. This means that causality has to be able to travel the void of dead cells, so that the absence of a glider can be causal. This gives a pretty good argument that every cell has some causal effect on its neighbours, even dead ones.
But if we allow that, we can suddenly draw effectively arbitrary causal arrows inside a completely dead board! So I don't think that can be right, either. My current lead for solving this is that the dead board has effectively no information; it's trivial to write a proof that every future cell is also dead. On the other hand, for a complex board, proving its future state can be very difficult and might require simulating every step. This seems to point to a difference in *informational* content, even in two places where we have similar causal arrows.
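Here is a minimal sketch of that contrast (my own toy code, using a standard set-of-live-cells Game of Life update, not anything from the comment): the all-dead board's entire future is known without simulating anything, while a glider's position 20 generations from now is, in general, only obtained by actually running the rule.

```python
# Cells are a set of live (x, y) coordinates; the rule is the standard B3/S23.

def neighbours(cell):
    x, y = cell
    return {(x + dx, y + dy)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)}

def step(live):
    """Advance one Game of Life generation."""
    candidates = live | {n for c in live for n in neighbours(c)}
    return {c for c in candidates
            if len(neighbours(c) & live) == 3
            or (c in live and len(neighbours(c) & live) == 2)}

# A completely dead board: its future is trivially all-dead forever,
# since step(set()) == set() for every generation. No simulation needed.
assert step(set()) == set()

# A glider: to know where it will be 20 generations from now, we
# (in general) actually have to apply the rule 20 times.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(20):
    state = step(state)
print(sorted(state))  # the same glider shape, translated across the board
```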
So my suspicion is that random interactions inside walls might not contain the right information to encode a person. Unfortunately I don't know much information theory yet, so my progress in figuring this out is slow.
The bomb trap example is very interesting! Can't be counterfactually robust if you're dead. Instead of bombs, we could also just use sudden, overwhelming sensory inputs in whichever modality is processed fastest, to interrupt other processing. However, one objection could be that there exist some counterfactuals (for the same unmodified brain) where the person does what they're supposed to. Objects we normally think of as unconscious don't even have this weaker kind of counterfactual robustness: they need to be altered into different systems to do what they're...