Are AIs conscious? It might depend
As AI progresses rapidly, humanity will have to solve a large number of problems in a short period of time. The most pressing of these right now is the AI Alignment problem; after all, hardly anything else matters if we are all dead. A problem that will soon be equally pressing, however, is the Hard Problem of Consciousness. We have already begun to create millions of minds' worth of AI intelligence, and soon the number of AIs will absolutely dwarf the number of humans on Earth. This creates an enormous S-risk (suffering risk): if AIs are conscious beings capable of suffering, we might be creating trillions of lives not worth living.
For the moment, we are probably in the clear. Any decent theory of consciousness should require that conscious beings have a persistent model of the world that includes themselves. LLMs fail this test immediately: they have no persistent model of the world, and indeed no persistent model of themselves. Instead, they seem to draw from a pool of billions of potential selves at sampling time; when sampling ends, these selves aren't killed, they simply return to the infinite sea of possibilities.
With robots, however, it is a different story. Who among us hasn't seen a video of a robot falling over and felt a twinge of empathy? Robots have a persistent self (the robot's body) and a model of the world (which they must have in order to navigate it). It would be strange, therefore, if robots were much less conscious than, say, fruit flies. But how conscious? And does this give them moral standing?
Two Theories of Consciousness
One class of theories about consciousness holds that beings are conscious by virtue of possessing a set of properties. Folk theories of consciousness, for example, hold that humans are conscious by virtue of possessing souls. Panpsychism holds that everything is conscious by virtue of existing. Integrated Information Theory (IIT) holds that beings are conscious by virtue of integrating large amounts of information into a unified whole.
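To make IIT slightly more concrete, here is a schematic gloss of its core quantity (my simplified notation, not IIT's full formal definition): a system's integrated information, Φ, measures how much the system's causal behavior exceeds that of its best decomposition into independent parts,

$$
\Phi(S) \;=\; \min_{P}\; D\!\left[\; p(S_{t+1} \mid S_t) \;\Big\|\; \prod_{M \in P} p(M_{t+1} \mid M_t) \;\right],
$$

where P ranges over partitions of the system S into parts M, and D is a divergence between the whole system's transition probabilities and the product of its parts'. On this view, the quantity of a system's consciousness is its Φ.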
As a mathematical realist, however, I find these theories difficult to accept. Within the digits of pi there exist (assuming, as is widely conjectured but unproven, that pi is a normal number) infinitely many copies of the information that describes me. Yet I do not consider my consciousness to exist in these copies, but rather in the real world. If I were to die tomorrow, I would find very little comfort in knowing that I would continue to live on in the digits of pi.
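As a toy illustration of that claim (a minimal sketch in Python using the mpmath library; the pattern and digit count are arbitrary choices of mine): if pi is normal, every finite digit string, and hence any digital encoding of a person, appears somewhere in its expansion. Short strings already show up surprisingly early:

```python
# Search the first 100,000 decimal digits of pi for a short pattern.
# Under the (unproven) normality conjecture, ANY finite string -- including
# a binary encoding of you -- would appear somewhere in the expansion.
from mpmath import mp

mp.dps = 100_000                      # working precision: 100,000 digits
digits = mp.nstr(mp.pi, 100_000)     # "3.14159..." as a string
digits = digits.replace(".", "")     # drop the decimal point

pattern = "999999"                   # six 9s: the famous "Feynman point"
index = digits.find(pattern)
if index != -1:
    print(f"'{pattern}' first appears around digit {index} of pi")
else:
    print(f"'{pattern}' not found in the first {mp.dps} digits")
```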
Similarly, the many-worlds interpretation holds that endless versions of me are created whenever a quantum branching event takes place. And yet I likewise give these copies very little regard. For example, at this very moment there exist counterfactual versions of me engaging in practically every horrible act I can imagine (murder, theft, buttering the wrong side of bread). After all, each of these actions is within my power to choose, and choice is merely an outcome of quantum randomness. And yet I feel far less horror than I would if a clone of me were to suddenly appear in front of me and announce it had committed murder.
This suggests an interesting possibility: that morally significant consciousness derives not from the properties of a being, but from our relationship with it. The view that consciousness depends as much on what’s “out there” as on what’s “in here” is known as Externalism. I don’t consider the versions of me in pi or in parallel universes conscious because they do not interact with the world in which I find myself.
If we extend this principle to AI, we find the same pattern at work. I don’t consider LLMs conscious because they barely interact with the outside world (beyond the words I type into them and the computer hardware used to evaluate their weights). Consider: if we were to delete an LLM, its weights would exist exactly as much in the digits of pi as they do now. On the other hand, when I interact with robots, I feel that they are self-aware because they respond to the world in much the same way that I would.
This view also excludes certain types of utility monsters. An AI that simulated infinite happiness without ever interacting with the outside world wouldn’t have any more moral significance than a block of stone.
So, Are AIs Conscious?
If Externalism is correct, this is not a question we can answer by observing any property of the AI itself. We cannot say that an AI is conscious because it computes a certain number of FLOPs or can pontificate about certain philosophical questions. Rather, the question of AI consciousness turns on the AI’s place in our world and how we interact with it. Even a relatively intelligent AI (like GPT-4 or Claude Opus) may have virtually no consciousness if its entanglement with the social web of humanity is minimal. On the other hand, a less intelligent, dog-like AI could actually be more conscious if it becomes a loving family member in the home where it lives.
I should note that one reason I like this theory is convenience: I prefer to believe that my pets are conscious, valuable beings and that the bugs that hit my windshield are not. Reality is not required to be convenient. I could be wrong.
And if I am, we need to find out.
Soon.