
Summary: I apply some philosophy of mind to some common mistakes people make about (possible) AI minds. In our current situation (where we are very ignorant of the relationships between the following things), take care to talk about problem-solving ability, goal-orientedness, phenomenal consciousness, self-consciousness, and moral patiency separately, and to watch out for others eliding these concepts under the single word “sentience”.


Epistemic status: Written quickly. But I did write my PhD on the philosophy of consciousness, so I have subject-matter expertise to draw on in the judgments I express here. 


Thanks to my Arb colleague Gavin Leech for heavily revising this in a way that greatly improved its style and readability.



One thing I see a lot in the media is discussions about when AI will become “sentient”. Unfortunately, these discussions are often quite misleading:

  1. If you read them casually, you’d come away confused about what the word “sentient” means. 
  2. You could also come away with the impression that being sentient is closely connected to being very smart and capable. It’s unclear how true this is: I find it plausible (maybe 60% likely) that a non-sentient AI could still be very smart and agentic. I’m also unsure whether training a very smart AI gets you a conscious one by default (maybe 70% that you’d get one by default, and 30% that you wouldn’t). And I think it is very likely (>90%) that a not-very-capable AI could be sentient. 


Confusion about “sentient” 


Here’s a definition of sentience (found by googling ‘are lobsters sentient’) which matches what I thought was common knowledge:

Sentience is the ability to have feelings, such as pain, distress, or comfort.

However, popular articles about AI, and the putative experts quoted within them, often use “sentience” in very different ways:

a) ‘Lemoine says that before he went to the press, he tried to work with Google to begin tackling this question – he proposed various experiments that he wanted to run. He thinks sentience is predicated on the ability to be a “self-reflective storyteller”, therefore he argues a crocodile is conscious but not sentient because it doesn’t have “the part of you that thinks about thinking about you thinking about you”.’

b) ‘The AI community has historically been swift to dismiss claims (including Lemoine’s) that suggest AI is self-aware [my italics], arguing it can distract from interrogating bigger issues like ethics and bias.’ 

c) ‘Moral outsourcing, she says, applies the logic of sentience [my italics] and choice to AI, allowing technologists to effectively reallocate responsibility for the products they build onto the products themselves…’

The first two passages here give you the impression “sentience” means ‘having a self-concept’ or ‘being able to identify your own sensations’. But this is not obviously the same thing as being able to undergo pains, pleasures and other sensations (phenomenal consciousness). The idea that chickens, for example, have pleasures and pains and other sensations seems at first glance to be compatible with them lacking the capacity to think about themselves or their own mental states. In the third passage meanwhile, it’s not entirely clear what “sentience” is being used to mean (moral agency and responsibility?), but it’s not ‘feels pleasures, pains and other sensations’. 

“Sentience” as the ability to have pleasures, pains and other sensations is (I claim) the standard usage. But even if I’m wrong, the important thing is for people to make clear what they mean when they say “sentient”, to avoid confusion. 

Philosophers of mind usually use the technical term “phenomenal consciousness” to pick out sentience in the ‘can experience pleasures, pains and other sensations’ sense. Lots of information processing in the brain is not conscious in this sense. For example, during vision, the information encoded in the retinal image goes through heavy preprocessing you are not aware of, and which isn’t conscious. 

In fairness, one tradition in the philosophy of consciousness holds that the capacity to be aware of your own current experience is a necessary condition of being sentient at all: the so-called ‘higher-order’ theories of phenomenal consciousness and sentience. On these theories, what makes a process phenomenally conscious is that the subject is aware that they are having an experience which contains that information. So, for example, what makes a visual representation of red in the brain phenomenally conscious is that you’re aware you’re having an experience of red. On this view, only agents who are self-aware can have conscious experiences. So this view implies that only agents who can be self-aware can be sentient in the ‘experiences sensations, such as pleasure and pain’ sense.  

But whilst this view has some academic endorsement, it is nothing like a consensus view. So we shouldn’t talk as if being sentient required self-awareness, without clarifying that this is a disputed matter.

Sentience as intelligence


Reading newspaper articles on AI, you could also easily come away with the impression that being sentient is closely connected to being smart and capable. In fact, it’s unclear how true this is: I find it plausible (maybe 60% likely) that a non-sentient AI could still be very smart and agentic. I’m also unsure whether training a very smart AI gets you a conscious one by default (maybe 70% that you’d get one by default, and 30% that you wouldn’t). And I think it is very likely (>90%) that a not-very-capable AI could in principle be sentient. 

Last year, the New York Times published an article arguing that current AIs are not sentient and speculating about why some people in tech (allegedly) mistakenly think that they are. However, the discussion in the article ranges beyond sentience itself, to the idea that AI researchers have a long history of overhyping the capabilities of AIs:

‘The pioneers of the field aimed to recreate human intelligence by any technological means necessary, and they were confident this would not take very long. Some said a machine would beat the world chess champion and discover its own mathematical theorem within the next decade. That did not happen, either.

The research produced some notable technologies, but they were nowhere close to reproducing human intelligence.’ 

The article goes on to discuss worries about AI x-risk, in a context that arguably implies these worries are part of the same overhyping of AI capabilities:

‘In the early 2000s, members of a sprawling online community — now called Rationalists or Effective Altruists — began exploring the possibility that artificial intelligence would one day destroy the world. Soon, they pushed this long-term philosophy into academia and industry.

Inside today’s leading A.I. labs, stills and posters from classic science fiction films hang on the conference room walls. As researchers chase these tropes, they use the same aspirational language used by Dr. Rosenblatt and the other pioneers.

Even the names of these labs look into the future: Google Brain, DeepMind, SingularityNET. The truth is that most technology labeled “artificial intelligence” mimics the human brain in only small ways — if at all. Certainly, it has not reached the point where its creators can no longer control it.’

Whilst the article itself does not actually make this claim, reading it could easily give you the impression that there is a tight link between an AI being sentient, and it displaying impressive intellectual capabilities, agency, or the ability to take over the world. 

In fact however, it’s not obvious why there couldn’t be a highly capable – even overall superhuman –  AI that was not sentient at all. And it’s very likely that even some fairly unimpressive AIs that are a lot less capable than human beings could nonetheless be sentient. 


Why it’s plausible that a non-sentient AI could still be very smart and agentic


To be smart, an agent just has to be good at taking in information, and coming up with correct answers to questions. To be capable, an agent just has to be both good at coming up with plans (a special case of answering questions correctly) and able to execute those plans.

We know that information processing can be done unconsciously: plenty of unconscious processing goes on in the brain. So if you want to argue that performing some particular cognitive task requires sentience, you need to explain why unconscious information processing, despite being possible, can’t accomplish that particular task. Likewise, if you want to argue that training an AI to be good at a particular task will probably produce a conscious AI, you need an argument for why it's easier to find a conscious than unconscious way of performing that task.  

It would be nice if we could a) look at what “the science” says about when information processing is conscious, and then b) check, for particularly impressive intellectual tasks, whether they could possibly/plausibly be performed without doing the sort of processing the theory says is conscious. If we could do this, we could figure out whether an AI could plausibly do the cognitive tasks needed for something big – automating science, performing all current white-collar labour, or attempting to take over the world – without being conscious. 

Unfortunately, we can’t really do that as – steel yourself – people in cognitive science and philosophy don’t agree on what makes some information-processing conscious rather than unconscious. A huge variety of academic theories of consciousness get taken seriously by some decent number of philosophers and cognitive scientists. And those theories often imply very different things about which sort or amount of processing is conscious. 

(This is not the only problem: it’s also likely to be a very hard question what sort of processing you need to do to be able to accomplish very broad tasks like ‘automate science’, or indeed, much more specific tasks. We don’t actually understand everything about how human vision works, for example, let alone all the alternative ways tasks currently performed by our visual system could be accomplished.)

What I can say is: it's not clear to me that any theory of consciousness I know of implies that a very smart AI, capable of doing the vast majority of reasoning human beings perform, would have to be conscious.

Global Workspace theories


According to these theories, consciousness works like this: 
A mind has lots and lots of different subsystems which process information: for example, they work out what you’re seeing based on features of the retinal image, or they help you choose what action to perform next. But they tend to do their task without checking what’s going on in the other subsystems. Sometimes, however, those subsystems broadcast the result of their information-processing to a “global workspace”. And all the subsystems do pay attention to information that reaches the global workspace, and make use of it in their own reasoning. When information is broadcast to the global workspace, there’s a corresponding conscious experience. But no conscious experience occurs when the subsystems process information internally.
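The architecture just described can be sketched as a toy program. This is purely illustrative: the class names and structure are my own, not taken from any specific formulation of the theory.

```python
# Toy sketch of a global-workspace architecture (illustrative only).

class Subsystem:
    def __init__(self, name):
        self.name = name
        self.inbox = []  # broadcasts received from the workspace

    def process(self, stimulus):
        # Internal processing: on the theory, this stage is NOT conscious.
        return f"{self.name} processed {stimulus}"

    def receive(self, broadcast):
        # Every subsystem attends to information reaching the workspace.
        self.inbox.append(broadcast)


class GlobalWorkspace:
    def __init__(self, subsystems):
        self.subsystems = subsystems

    def broadcast(self, content):
        # On global workspace theory, it is this broadcast step that
        # corresponds to a conscious experience of the content.
        for s in self.subsystems:
            s.receive(content)


vision = Subsystem("vision")
planning = Subsystem("planning")
workspace = GlobalWorkspace([vision, planning])

local_result = vision.process("retinal image")  # unconscious processing
workspace.broadcast(local_result)               # the 'conscious' step

print(planning.inbox)  # planning now has access to vision's result
```

The point of the sketch is just that the broadcast step is architecturally separable: a system could do all the local processing without ever implementing the workspace, which is why the theory does not make consciousness an automatic consequence of capability.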

Clearly, some kind of argument would be needed before we can conclude that an AI as intelligent as a human, or which is capable of automating science, or which is superintelligent, would have to have this sort of separate subsystems-plus-global workspace architecture.

Maybe it’ll turn out that training AIs to do complicated tasks just naturally results in them being organized in a global workspace plus subsystems way. But even if global workspace theorists turn out to be right that this is how the human mind is organized, its emergence in AI trained to do complex tasks is not guaranteed.


Higher-order theories 


Explained above. On higher-order theories, for information-processing to give rise to conscious experience, the subject needs to be in some sense aware of the processing. In other words, the reason, for example, that your visual experience of red is conscious, and not unconscious like much processing of visual sensory input, is that the experience causes you to realize that it’s an experience of red. 
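The two-level structure the theory posits can also be put in toy-program form (again purely illustrative; the names are my own invention):

```python
# Toy sketch of the higher-order-theory structure (illustrative only).

class Agent:
    def __init__(self):
        self.first_order = []   # representations of the world (e.g. "red")
        self.higher_order = []  # representations of one's own representations

    def represent(self, content):
        # First-order processing alone: on higher-order theories,
        # this is NOT yet a conscious experience.
        self.first_order.append(content)

    def monitor(self):
        # The higher-order step: the agent represents that it is itself
        # having those first-order states. On the theory, it is this
        # step that makes the first-order state conscious.
        for content in self.first_order:
            self.higher_order.append(f"I am having an experience of {content}")


agent = Agent()
agent.represent("red")   # first-order state only: unconscious on the theory
agent.monitor()          # higher-order monitoring: now 'conscious'
print(agent.higher_order)
```

As with the global workspace sketch, the monitoring step is separable from the first-order processing, which is why an agent could in principle do a great deal of first-order reasoning without ever meeting the theory’s criterion for consciousness.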

Clearly, an AI could perform a lot of complicated and impressive thinking without needing to be aware of particular processing episodes within itself and classify them as conscious experiences of a certain kind. So the mere fact that an AI is smart, and impressive, and super-human in many domains is not evidence that it counts as conscious by the lights of higher-order theories. On the other hand, it is somewhat plausible that in order to be able to “take over the world” an AI would need to be self-aware in some sense. It’s hard to engage in sophisticated long-run planning about the results of your actions, if you don’t understand basic facts about what and where you are. 

But ‘having accurate beliefs about yourself’ and ‘being able to recognize some internal information-processing episodes as they occur’ are not the same thing. So in order to get from ‘AI must be self-aware to be a takeover risk’ to ‘AI must meet the higher-order theory criteria for being conscious to be a takeover risk’, we’d need some kind of argument that you can’t have the former without the latter. This looks non-trivial to me. 

These are not by any means the only theories of consciousness that are taken seriously in cognitive science. But I strongly doubt that many of the other theories taken seriously make it completely obvious that a human-level, or existentially dangerous, or super-human AI would have to be conscious, and it’s not clear to me how many even suggest that it would be more likely to be conscious than not. 

Relatively unintelligent AIs could (in principle) be sentient


Many people seem to think that building a conscious AI would be super-hard, and something that would probably only happen if the AI was able to do things we find intuitively intellectually impressive, or human-like. I am skeptical: many animals – chickens, for example – are widely perceived as conscious/sentient (that’s chiefly why people object to factory farming them!), and yet are not particularly intellectually impressive or human-like. If chickens can manage this, why not AIs? 

(I don’t have a strong sense of which current academic theories of consciousness are or are not compatible with chickens being conscious, so it’s possible some might rule it out.)

It might well be possible to construct a conscious AI right now. We could almost certainly construct an AI with a global workspace architecture if we really wanted to now, and global workspace theory implies that such an AI would be conscious. I am currently unsure whether we could construct an AI that counts as conscious by the standards of the best higher-order theories. (This seems a very complex question to me that I would like to see serious philosophical work on.)  But “unsure” is a weak argument for “it’s very unlikely we can”. 

More generally, we currently don’t really know what consciousness/sentience is, and probably no one knows every possible way of constructing AIs. So current knowledge doesn’t seem to rule out, or even provide much direct evidence against, the idea that there are sentient AIs in the set of constructible AIs. 

Meanwhile, the fact that it’s at least plausible that relatively unsophisticated animals, like chickens, are sentient, suggests we shouldn’t place a particularly low prior on the idea that it is currently within our capacity to construct a sentient AI, just because we clearly do not yet have the ability to construct AIs as generally intelligent as humans. People should avoid saying things which imply that AI consciousness would be a very difficult and super-impressive achievement that we are currently far from reaching. 



Thank you for the post! 

I just want to add some pointers to the literature which also add to the uncertainty regarding whether current or near-future AI may be conscious: 

VanRullen & Kanai have made reasonably concrete suggestions on how deep learning networks could implement a form of global workspace: https://www.sciencedirect.com/science/article/abs/pii/S0166223621000771 

Moreover, the so-called "small network" or "trivial realization" argument suggests that most computational theories of consciousness can be implemented by very simple neural networks which are easy to build today: https://www.sciencedirect.com/science/article/abs/pii/S0893608007001530?via%3Dihub


Thanks for clarifying!

It seems like it would be good if the discussion moved from the binary-like question "is this AI system sentient?" to the spectrum-like question "what is the expected welfare range of this AI system?". I would say any system has a positive expected welfare range, because welfare ranges cannot be negative, and we cannot be 100 % sure they are null. If one interprets sentience as having a positive expected welfare range, AI systems are already sentient, and so the question is how much.

I think something like this is right, but I'm not entirely sure what an expected welfare range is. Suppose I think that all conscious things with pleasurable/painful experiences have the same welfare range, but there is only a 1 in 1000 chance that a particular AI system has conscious pains and pleasures. What would its expected welfare range be? 

The expected welfare range can be calculated from "probability of welfare range being positive"*"expected welfare range if it is positive", and is usually assumed to be 1 for humans. So it would be 10^-3 for the case you described, i.e. having 1 k such AI systems experiencing the best possible state instead of the worst would produce as much welfare as having 1 human experiencing the best possible state instead of the worst.
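A minimal sketch of this calculation (the function name is my own). Note that plugging in a 1-in-1000 probability and a human-normalised conditional welfare range of 1 gives 10^-3:

```python
# Sketch of the expected-welfare-range calculation described above,
# using the numbers from this thread: a 1/1000 probability that the AI
# system has conscious pains and pleasures, and a welfare range of 1
# (human-normalised) conditional on it having them.

def expected_welfare_range(p_positive, range_if_positive=1.0):
    """Expected welfare range = P(range > 0) * E[range | range > 0]."""
    return p_positive * range_if_positive


ewr = expected_welfare_range(1 / 1000)
print(ewr)  # 0.001

# Number of such AI systems whose combined expected welfare range
# matches one human's (welfare range 1):
print(round(1 / ewr))  # 1000
```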

Okay, that makes sense I guess. 
