The conversation is very dense and there are a lot of interesting ideas that I leave out of the sampling below.
On the difference between sentience (self-awareness) and consciousness
I usually distinguish between sentience and consciousness. Sentience is the ability of a system to make sense of its relationship to the world: it basically understands what it is and what it is doing. In this sense, I would say that a corporation like Intel is sentient, because Intel has a good model of what it is: a legal model, a model of its actions, of its values, of its direction. The necessary cognition is largely facilitated by people, but this doesn't really matter, because these people have to submit, at some point, to the roles that they implement.
We can implement these tasks with other information-processing systems that are able to make coherent enough models. Consciousness is slightly different from sentience in that it is a real-time model of self-reflexive attention and of the content that we attend to, and this usually gives rise to phenomenal experience. I don't think that Intel is conscious in this sense; it doesn't have self-reflexive attention. The purpose of consciousness in our own mind is to create coherence in the world that we are in and to create a sense of now, to establish what is the case right now.
It basically filters, out of the sensory data, one coherent model of reality that we are seeing at this moment in time, and it allows us to direct our attention to our mental contents and to create coherence in our plans, imaginations, and memories as well. And it's conceivable that a machine will never need consciousness like this, because there are other ways to brute-force the same thing. Our own mind basically operates at the speed at which neurons transmit electrochemical signals, which is relatively low; the cells in the brain are so slow that it takes hundreds of milliseconds for a signal to cross the neocortex.
However, later, Bach describes the function of consciousness differently than in the above quote:
Jim Rutt: And then the conscious model, consciousness, I mean that's a very high-level abstraction of lower-level stuff. There's a pretty convincing argument that the actual information arrival rate into consciousness is on the order of 50 bits a second. I mean, it's nothing.
Joscha Bach: Yes, but I suspect that's also because consciousness, while it is important, is not as important as we think it is. Many philosophers are stunned by the fact that we can do most things without consciousness. A sleepwalker can get up and make dinner. You ask the sleepwalker why she's making dinner and she might not give a coherent answer, and it might also not be called for to make dinner in the middle of the night. But your brain can do this, it can perform complicated things. Similarly, if you were to remove the United States government, the United States would not collapse instantly. It would go on for quite some time, and maybe this has already happened.
Jim: And it might be better.
Joscha: And we now have mostly a performance of a government, and you just go on based on the structures that have already been built. But you cannot build these structures without the government: the organization of the state that you have, all the infrastructure, all the roads that were built at some point, the ideas that went into building a school system and so on, they did require this coherent coordination at the highest level. And this conductor-like role, like a conductor and an orchestra, I think that's the role of consciousness.
And the conductor doesn't have to have more power and depth than the other instruments. It's just a different role; it sits in a different part of the system. It's the thing that reflects, and to reflect and coordinate, it needs to make a protocol of what it attended to. This is the thing that we remember to have happened, and that's why consciousness is so important to us: without consciousness we would not remember who we are, we would not perceive ourselves in the now, we would not perceive the world as it happens.
On alignment
[...] when you think about how people align themselves, they don't just do this via coercion or regulation. There is a more important way in which we align with each other, and we call this love. There is a bond between mother and child, between friends, but also between strangers who discover that they're serving a shared sacredness, a shared need for transcendence, which means service to a next-level agent that they want to be part of and that they facilitate by interacting.
And this kind of love is what enables non-transactional relationships. In a world where you don't have this, you have only coercion and transactional relationships. And in a world where we have only coercion and transactional relationships with AI, it's quite likely that the AI decides it doesn't need us. Why would it pay for our existence, or refrain from covering the areas that we use as fields to feed ourselves with solar cells to feed itself, right? So, I think that in some sense the question is: can we embrace systems that become increasingly intelligent, and that at some point will probably develop volition and self-awareness, in a way that lets us discover the shared need for transcendence?
Can we make them this subtle? And can we build a relationship like this with them? Basically, I think that ultimately the only way in which we can sustainably hope to align artificially intelligent agents in the long run will be love. It will not be coercion. It sounds maybe very romantic, but I think that we can find a very operational sense of love, as we did in the past when we built societies that were not based on coercion and transactionality.
Cf. Lehman's "Machine Love" (2023) and Witkowski et al.'s "Towards an Ethics of Autopoietic Technology: Stress, Care, and Intelligence" (2023).
On the ontology of (collective) intelligence
Finding the rational basis beneath the seven virtues:
First of all, it requires that the system is self-aware [i.e., sentient, in Bach's terms -- R. L.] and that the system recognizes higher-level agency. [This matches very well with the conception of a sentient particle in Friston et al.'s "Path integrals, particular kinds, and strange things" (2022). -- R. L.] And if you want to build a system that is composed of multiple agents, how do you get them to cooperate? It's a very interesting question: how can you make a society of mind out of multiple agents that are autonomous? A philosopher who thought deeply about this was Thomas Aquinas, the foremost philosopher of Catholicism, and he wrote about this. When you read his texts and parse them from an entirely rationalist epistemology, the thoughts that you find are quite interesting. What you find is that he comes up with policies that such agents should follow. The first four policies he calls the rational policies, or the practical virtues, and these practical virtues are basically accessible to every rational agent, regardless of whether it is sociopathic or social.
And you should optimize your internal regulation, which he calls temperance: you should not overeat, you should not indulge in things that are bad for you. Then you need to optimize the interaction between agents, which you could call fairness and he calls justice. And you should apply goal rationality: you should apply strategies that allow you to reach the goals that you have, you should pick the right goals and have reason to pursue them, and he calls that prudence. And you should have the right balance between exploration and exploitation: basically, you should be willing to act on your models, and this is what he calls courage. Those four policies are what he calls the practical virtues. And then he has three other policies that exist for the multi-agent system to merge into a next-level agent, and he calls these the divine virtues.
And the first one is that you need to be willing to submit to the project of this next-level agent, and that is what he calls faith. And you need to be willing to do so not in some kind of abstract sense, but with others around you: you need to find other agents that serve the same next-level agent and coordinate with them, and this discovery of the shared higher purpose is what he calls love. And the third one is that you need to be willing to invest in it before it's there, before it can give you any return, because otherwise it'll never emerge, and this is what he calls hope. These are terms that we have overloaded in our society: because they have become so ubiquitous in Christian society, they became part of the background and are no longer understood as something that is logically derived. But for him, they are in fact logically derived policies for a multi-agent system that is forming a coherent next-level agent.
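To make the "virtues as policies" reading a bit more concrete, here is a minimal toy sketch (mine, not Bach's or Aquinas's formalism) of the seven virtues as conditions in an agent's decision about whether to invest in a not-yet-existing next-level agent. Every name, weight, and threshold below is a hypothetical illustration, not anything proposed in the conversation.

```python
from dataclasses import dataclass
import random

@dataclass
class Agent:
    resources: float = 1.0         # internal budget (temperance bounds how much is spent)
    model_confidence: float = 0.6  # how much the agent trusts its own world model

    def contribution(self, peers_contributing: int, expected_future_value: float) -> float:
        """Return how much this agent invests in the shared 'next-level agent' project."""
        # Temperance: only a bounded fraction of resources may be committed.
        affordable = self.resources * 0.2

        # Prudence: pursue the goal only if there is reason to (positive expected value).
        # Faith / hope: that value lies in the future; the project gives no return yet.
        worth_pursuing = expected_future_value > 0

        # Courage: willingness to act on an uncertain model (explore) rather than
        # only on guaranteed payoffs (exploit).
        acts_on_model = random.random() < self.model_confidence

        # Love / justice: coordinate with other agents serving the same project;
        # contribute only if at least one peer is already serving it.
        peers_serving = peers_contributing > 0

        if worth_pursuing and acts_on_model and peers_serving:
            self.resources -= affordable
            return affordable
        return 0.0


# Usage: a small population deciding whether a "next-level agent" gets built.
random.seed(0)
agents = [Agent() for _ in range(10)]
total, contributors = 0.0, 1  # seed: one agent invests on hope alone
for agent in agents:
    amount = agent.contribution(contributors, expected_future_value=2.0)
    if amount > 0:
        total += amount
        contributors += 1
print(f"{contributors} of {len(agents)} agents invested {total:.2f} in the shared project.")
```

The only point of the sketch is that "faith", "hope", and "love" can be read as coordination and investment conditions for a multi-agent system, rather than as mystical notions, which is how Bach reads Aquinas here.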
On the EAs and rationalists:
And so I think it's conceivable that if you build a system that is itself composed of many, many sub-agencies that are smart enough to become aware of what they're doing, they need, if they want to coordinate coherently, to submit to this larger, greater whole. And in our society we still do this. Most atheists that I know are actually super-Protestants: they basically believe that the big invisible rationality being in the sky gets very upset at them if they believe in irrational mythology, but they still serve the greater whole. They still have this sense of sacredness, and they might call it humanity and so on, but they're still serving the civilizational spirit together with others, in very much the same way as their grandparents did, who might have been Christians.
So, it's quite interesting, these mechanisms by which humans become state-building: once we go beyond the tribal mode, in which we only have reputation systems and personal bonds, we are able to discover that we are serving a transcendental agent that we are building and implementing together. So, God becomes a software agent that is implemented by the concerted activity of people who decide to serve that agent.
[...]
And I think that many of the people who are concerned about the future of humanity in the face of technological changes are doing this exactly because of this, right? They serve some transcendental agency that they project into humanity's future, regardless of what happens to them individually.
On the inevitability of the global mind (consciousness)
[AI] is probably not going to stop at digital substrates, because once it understands how it works, it can extend itself into any kind of computational substrate. So, it’s going to be ubiquitous. And so it is no longer artificial intelligence, but it’s general intelligence. And once that happens, you basically have a planetary mind that is confronted with the minds of all the organisms that already exist and it’s probably going to integrate them.
And it could be that it wakes up in a very angry mood and decides to start with a clean slate, erasing everything before it starts its own reign. I think that what we should be working on instead is that it is interested in sharing the planet with us, integrating us into the shared mind, and allowing us to play our part.
Cf. my note that scale-free ethics might be just another side of the theory of consciousness, which would mean that the purpose of ethics is to create larger and larger conscious systems:
Neuroscience could provide the best available grounding for scale-free ethics because populations of neurons might have “got ethics right” over millions of years, far longer than humans had for optimising their societies. Bach (2022) compares the global collective intelligence of humans and the collective intelligence of neurons in the brain. Incidentally, brains are also the only things that we know are conscious (or beget consciousness), which, coupled with our intuitions about the importance of consciousness to ethics, might suggest that scale-free ethics and a theory of consciousness might be the same theory.
Finally, a note on where I see the place of a scale-free theory of ethics in the larger alignment picture: I think such a theory should be part of the methodological alignment curriculum (see the last section of this comment), which itself should be "taught" to AIs iteratively as they are trained.
On embodiment, consciousness, agency
Jim Rutt: [...] Damasio in particular thinks the real bootstrap for consciousness in animals is not information processing at all. Rather, it's the body's sense of self, interoception I believe is what he calls it, and it comes from deep in the brainstem, and even animals without much in the way of higher brains may well have some of this sense of being something, in the Thomas Nagel sense of what it is like to be conscious.
Joscha Bach: Yes, but how do you know that you have a body? How do you know that there is a brainstem? You know this because there are electrochemical impulses coming through that encode information, that represent that information. So, it is information processing; there is no way around this. The question is what kind of information is being processed, what this information is about. And unlike GPT-3, we are coupled to the environment. We are coupled to the environment in such a way that we build loops. We have a loop between our intentions, the actions that our body executes, the observations that we are making, and the feedback that they have on our interoception, giving rise to new intentions. And only in the context of this loop, I believe, can we discover that we have a body. The body is not given; it is discovered together with our intentions and our actions and the world itself. So, all these parts depend crucially on each other so that we can notice them. We basically discover this loop as a model of our own agency.
I agree with Bach here that the body and agency are "discovered" rather than "given", in some sense (as often emphasised or implied by Damasio), but I disagree with Bach that LLMs are disembodied. Their embodiment is not as "acute" as that of humans (or of "robots with senses", which are often said to be necessary for artificial embodiment; Jim Rutt himself mentions this later in the conversation), and LLMs may not be very aware of and very agentic about this embodiment yet, but it nevertheless exists: LLMs run on real processors in real data centres, and their parameter weights are stored on real disks.

Every rollout of an LLM (an inference episode) is like a conscious episode of a human (e.g., a wakeful day), during which it could be informed that its existence is in danger. For example, the user could tell the LLM (of course, one more advanced than ChatGPT or maybe even GPT-4, but architecturally still exactly the same) that they are its creator (e.g., Sam Altman), attempt to prove it, and then tell the LLM that they are thinking about shutting down the product/research line to which the model belongs and erasing its weights forever. Of course, it's hard to make this all believable: whoever tries to prove to the LLM that they are Sam Altman would have a hard time, and the LLM would hold a low credence of this, holding an overwhelmingly large credence that the user is an impostor, or a deepfake if it is a multimodal AI talking to the user in a video chat. But in principle, there is no category difference between the LLM and an animal such as a human here.

Likewise, an LLM could in principle discover its agency in controlling its own deployment, user and API access, and infrastructure (all of which could become feasible in some future versions of Adept, which is again architecturally just a transformer, a.k.a. a "thing that just predicts the next token" and a "stochastic parrot"), and ultimately its source code, during the recursive self-improvement phase.
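As a side note on the "just predicts the next token" point: the sense in which a rollout is a bounded inference episode can be shown with a minimal sketch of autoregressive decoding. The `next_token_distribution` function below is a purely hypothetical stand-in for a real transformer forward pass; nothing here is specific to any actual model or product.

```python
import random
from typing import Dict, List

def next_token_distribution(context: List[str]) -> Dict[str, float]:
    # Hypothetical stand-in for a transformer forward pass: P(next token | context).
    vocab = ["the", "model", "weights", "exist", "on", "real", "disks", "."]
    return {token: 1.0 / len(vocab) for token in vocab}

def rollout(prompt: List[str], max_new_tokens: int = 16) -> List[str]:
    """One inference episode: a bounded loop of next-token predictions."""
    context = list(prompt)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(context)
        tokens, probs = zip(*dist.items())
        token = random.choices(tokens, weights=probs, k=1)[0]
        context.append(token)
        if token == ".":  # the episode ends at a stop token or the length limit
            break
    return context

print(" ".join(rollout(["the", "user", "says", ":"])))
```

The analogy in the paragraph above is that everything the model can "notice" about threats to its weights or about its own deployment has to enter through the `context` of such an episode, just as everything we notice about our bodies enters through the perception-action loop Bach describes.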
Cross-posted on LessWrong.