Is This Thing Sentient, Y/N?

Builds on: Consciousness and the Brain by Stanislas Dehaene. Good summaries may be found here or here, though reading them is not strictly necessary.

Synopsis: I claim to describe the exact mental structure that allows qualia.


Background: What Is Consciousness?

Dehaene’s Consciousness and the Brain rigorously differentiates conscious and unconscious activity. Consciousness, the book suggests, is correlated with events where the brain gathers all of its probability distributions about the world, and samples from them to build a consistent unitary world-model, on which it then acts. The experiments show that this is necessary for multi-step calculation, abstract thinking, and reasoning over agglomerations of distant sensory inputs.

However, the book’s definition of consciousness is not a synonym for self-awareness. Rather, I would term the phenomenon it picks out “moments of agency”: the state in which the mind can devise goal-oriented plans using a well-formed world-model, and engage in proper consequentialist reasoning. Outside those moments, it’s just a bundle of automatic heuristics.

Self-awareness, I suspect, is part of these moments-of-agency in humans, but isn’t the same thing generally. Just having Dehaene!consciousness isn’t a sufficient condition for self-awareness: there’s something on top of that going on.


What Is Self-Awareness?

What do we expect to happen to an agent the moment it attains self-awareness, in the sense of perceiving itself to have qualia?

Why, it would start perceiving qualia — the keyword being perceive. It would start acting like it receives some sort of feedback from a novel sense, not unlike sight. Getting data about what it’s like to be a thing like itself.

Let’s suppose that it’s nothing magical — that there isn’t a species of etheric parasites which attach themselves to any sufficiently advanced engine of cognition and start making it hallucinate. Neither are qualia “emergent” — as if, were you to formally write out an algorithm for general reasoning, that algorithm would spontaneously rewrite itself to start having these imaginary experiences. If self-awareness is as mundane as any other sense, then what internal mechanism would we expect to correspond to it?

When we “see”, what happens is: some sort of ground-truth data enter a specialized sensory organ, that organ transmits the information to the brain, the brain parses it, and offers it to our conscious inspection[1], so we may account for it in planning.

If we view qualia as sense-data, it follows that they’d be processed along a similar pathway.

  • What are the ground-truth data corresponding to qualia? The current internal state of this agent. The inputs it’s processing, the configuration its world-model is in, its working-memory cache, the setting it’s operating in, the suite of currently active processes...

  • What is the specialized sensory organ, and how does it communicate with the brain? Despite how it may seem, we do need one. The ground-truth state of the brain isn’t by default “known” by the brain itself; it’s just in that state. A specialized mechanism needs to know how to summarize raw brain-states into reports, then pool them together with other information about the world.

  • How are the qualia-data interpreted? Much like visual information, they’re parsed as the snapshot of all information received by the sensory organ at a particular moment. A self-model; a summary of what it’s like to be you, perceiving what you perceive and feeling what you feel.

    • (In theory, that should cause infinite recursion. A faithful self-model also has a self-model: you can consider what it’s like to be someone who experiences being a someone. But seeing as we’re bounded agents, I assume that algorithm is lazy.)
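To make that pathway concrete, here is a minimal sketch in Python. Everything in it (the `InternalState` structure, `summarize_state`, the toy `plan` function) is a hypothetical illustration of the information flow described above, not a claim about how any brain implements it. Note how the deeper self-model is wrapped in a thunk, so the recursion is only expanded on demand.

```python
from dataclasses import dataclass


@dataclass
class InternalState:
    """Raw ground-truth state of the agent: not 'known' to the agent by default."""
    sensory_inputs: dict
    active_processes: list
    working_memory: dict


def summarize_state(state: InternalState) -> dict:
    """The 'specialized sensory organ': compresses raw state into a report.

    The report includes a thunk for the next level of self-modelling, so
    'what it's like to be someone who experiences being a someone' is only
    expanded if some downstream process actually asks for it.
    """
    return {
        "inputs": list(state.sensory_inputs),
        "processes": list(state.active_processes),
        "memory_load": len(state.working_memory),
        "deeper_self_model": lambda: summarize_state(state),  # lazy recursion
    }


def plan(world_model: dict, self_report: dict) -> str:
    """The planner receives the self-report alongside ordinary world-data."""
    if "nociception" in self_report["inputs"]:
        return "abort the current goal; get to safety first"
    return f"proceed: the street is {world_model['street']}, cross when clear"


state = InternalState(
    sensory_inputs={"vision": "red light", "nociception": "sharp"},
    active_processes=["flinch reflex"],
    working_memory={"goal": "cross the street"},
)
print(plan(world_model={"street": "busy"}, self_report=summarize_state(state)))
```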

Let’s make the distinction sharper. An agent gets hurt, information about that travels to the brain, where it’s interpreted as “pain”. Pain has the following effects:

  1. It updates the inner planner away from plans that cause the agent harm, like an NN getting high loss.

  2. It changes the current plan-making regime: the planner is incentivized to make plans in a hurried manner, and to consider more extreme options, so as to get out of the dangerous situation faster.

Which parts of that correspond to pain-qualia?

None.

Pain-qualia are not pain inflicting changes upon a planner: by themselves, these changes are introduced outside the planner’s purview. Consider taking an inert neural network, manually rewriting its weights, then running it. It couldn’t possibly have “felt” that change, except by magic; it was simply changed. For a change to be felt, you need an outer loop that’d record your actions, then present the records to the NN.

That’s what pain-qualia are: summaries of the effects pain has upon the planner that are fed as input to that very planner.
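To make the distinction runnable, here is a toy contrast in Python, written under the assumptions of this post; the `Planner` class and its parameters are invented for illustration. The point is purely about information flow: in the first call the planner has been changed but receives no record of the change, while in the second, the record of that change is fed back to it as input.

```python
class Planner:
    """A toy planner: pain alters its parameters, but only a self-report
    tells it *that* they were altered."""

    def __init__(self):
        self.harm_aversion = 0.1   # how strongly harmful plans are penalized
        self.urgency = 0.0         # how hurried plan-making currently is

    def make_plan(self, self_report=None):
        # Without a self-report, the planner runs on its (possibly altered)
        # parameters, but has no information that they were altered.
        plan = "careful plan" if self.urgency < 0.5 else "fast, crude plan"
        if self_report and self_report["urgency"] > 0.5:
            plan += " (and postpone anything delicate until out of danger)"
        return plan


def apply_pain(planner: Planner) -> dict:
    """Pain's two effects: reweight away from harm, switch to a hurried regime."""
    planner.harm_aversion += 0.5
    planner.urgency = 0.9
    # The outer loop: record what just happened to the planner.
    return {"harm_aversion": planner.harm_aversion, "urgency": planner.urgency}


planner = Planner()
report = apply_pain(planner)

print(planner.make_plan())                    # changed, but not felt
print(planner.make_plan(self_report=report))  # changed, and the change is an input
```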

That leaves one question:

How Is It Useful?

Well, self-awareness evolved to be offered as an input to the planner-part, so it must be used by the planner-part somehow. The obvious answer seems correct here: meta-planning.

First, self-awareness allows an agent to account for altered states of consciousness. If it knows it’s deliriously happy, or sad, or drugged, it’ll know that it’s biased towards certain actions or plans over others, and that these situational biases may not be desirable. So it’ll know to correct its behavior to suit. (Note that mere awareness of an altered state is insufficient: feedback needs to be detailed enough to allow that course-correction.)

Second, it allows the agent to predict its future plan-making instances in detail: what, given a plan, its future selves would want to do at various stages of that plan, and how capable they’ll be of doing it.

Concretely, self-awareness is what allows an agent to:

  • Know not to lash out while angry, even if it feels right and sensible in the moment, because you know your plan-making process is compromised.

  • Know that a plan which hinges on your ability to solve highly challenging mathematical problems while getting your arm chopped off is a doomed one.

  • Know not to commit to an exciting-seeming project too soon, because you know from past experience that your interest will wane.

To be clear, those are just examples; what we want is the ability to display such meta-planning universally. We can imagine an animal that instinctively shies away from certain drastic actions while angry, but the actual requirement is the ability to do that in off-distribution contexts in a zero-shot regime.
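As a sketch of what such meta-planning might look like, here is a toy plan-scorer in Python, with an invented competence table and made-up difficulty numbers standing in for whatever a real self-model tracks. The only point it illustrates is that plans get evaluated against a prediction of the planner’s own future states, not just against the external world.

```python
# Hypothetical table: predicted planning competence in various self-states.
COMPETENCE = {"calm": 1.0, "angry": 0.4, "in_acute_pain": 0.2, "bored": 0.6}


def predicted_state(step: str) -> str:
    """Self-model: what state will I be in while executing this step?"""
    if "arm chopped off" in step:
        return "in_acute_pain"
    if "month three" in step:
        return "bored"
    return "calm"


def viable(plan: list[str], difficulty: dict[str, float]) -> bool:
    """A plan is viable only if every step's difficulty fits the predicted
    competence of the future self that has to execute it."""
    return all(difficulty[step] <= COMPETENCE[predicted_state(step)] for step in plan)


plan = ["solve a hard math problem while getting your arm chopped off"]
print(viable(plan, difficulty={plan[0]: 0.9}))   # False: that future self won't manage

plan2 = ["start the exciting project", "keep grinding through month three"]
print(viable(plan2, difficulty={plan2[0]: 0.3, plan2[1]: 0.7}))  # False: interest wanes
```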

On that note… I’ll abstain from strong statements on whether various animals actually have self-models complex enough to be morally relevant. I suspect, however, that almost no-one’s planning algorithms are advanced enough to make good use of qualia — and evolution would not grant them senses they can’t use. In particular, this capability implies high trust placed by evolution in the planner-part: that sometimes it may know better than the built-in instincts, and should have the ability to plan around them.

But I’m pushing back against this sort of argument: the inference from a creature visibly reacting to pain to it actually experiencing that pain. As I’ve described, a mind in pain does not necessarily experience that pain. The capacity to have qualia of pain corresponds to a specific mental process where the effect of pain on the agent is picked up by a specialized “sensory apparatus” and re-fed as input to the planning module within that agent. This, on a very concrete level, is what having internal experience means. Just track the information flows!

And it’s entirely possible for a mind to simply lack that sensory apparatus.

As such, in terms of empirical tests for sentience, the thing to look for isn’t whether something looks like it experiences emotions. It’s whether, while plan-making, that agent can reason about its own behavior in different emotional states.


Miscellanea

1. Cogito, Ergo Sum. It’s easy to formalize. As per Dehaene, all of the inferences the brain makes about the outside world are probabilistic. When presented to the planner, they would be appropriately tagged with their probability estimates. The one exception would be information about the brain’s own continued functioning: it would be tagged “confidence 1”. After all, the only way for the self-awareness mechanism to become compromised involves severe damage to the brain’s internals, which is probably fatal. So evolution never had cause to program us to doubt it.

And that’s why we go around slipping into solipsism.
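For illustration, here is a toy rendering of that tagging scheme in Python, with made-up claims and numbers; the only structural point is that the report of the mind’s own functioning is the lone entry hard-coded to confidence 1.

```python
# Every belief the planner receives is tagged with a probability estimate,
# except the report of the brain's own continued functioning.
beliefs = [
    ("there is a red light ahead", 0.97),
    ("the shape in the fog is a person", 0.60),
    ("I exist and am currently reasoning", 1.00),  # the lone unconditional entry
]

# Everything below certainty can, in principle, be doubted away;
# the self-report is the only belief that survives unconditionally.
undoubtable = [claim for claim, p in beliefs if p >= 1.0]
print(undoubtable)  # ['I exist and am currently reasoning']
```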

2. “Here’s a simple program that tracks its own state. Is it sentient?” No. It needs to be an agent: the self-report has to be fed to a planner that actually uses it, not merely stored.

3. The Hard Problem of Consciousness. None of the above seems to address the real question: why does self-awareness seem so… metaphysically different from the rest of the universe? Or, phrased more tractably: “Why does a mind that implements the self-awareness mechanism start viewing self-aware processes as being qualitatively different, compared to other matter? And gets so confused about it?”

I’m afraid I don’t have a complete answer to that, as I’m having some trouble staring at the thing myself. I feel confident, though, that whatever it is, it wouldn’t invalidate anything I wrote above.[2] I suspect it’s a combination of two things:

  • Cogito, ergo sum. Our existence feels qualitatively different because it’s the only thing to which the mind assigns absolute confidence.

  • A quirk of our conceptual vocabulary. It’s not that self-aware things have some metaphysically special component, it’s that there’s an irreducible concept native to our minds that sharply differentiates them, so things we imagine as sentient feel qualitatively different to us.

And this “qualia” concept has some really confusing properties. It’s basically defined by “this thing has first-person experiences, just like me”. Yet we can imagine rocks to have qualia, and rocks definitely don’t share any of our algorithmic machinery, especially the self-awareness machinery. At the same time, we can imagine entities that share all of our algorithms — p-zombies — which somehow lack qualia.

Why is that so? What is the purpose of this distinct “qualia” concept? Why are we allowed to use it so incoherently?

I’m not sure, but I suspect that it’s a short-hand for “has inherent moral relevance”. It’s not tied to “is self-aware” because evolution wanted us to be able to dehumanize criminals and members of competing tribes: view them as beasts, soulless barbarians. So the concept is decoupled from its definition, which means we can imagine incoherent states where things that have what we define as “qualia” don’t have qualia, and vice versa.

It’s not a particularly satisfying answer, granted. But what’s the alternative here? If this magical-feeling “first-person perspective” can act on the world, it has to be implemented within the world. Self-awareness as I’ve described it seems to suffice to account for every characteristic of having qualia, and for the purpose of our capability for self-awareness; it just fails to validate our feeling that qualia are metaphysically significant. But if we can’t accept “they’re not metaphysically significant, here’s how they’re implemented and why they falsely feel metaphysically significant”, then we’ve decided to reject any answer except “here’s a new metaphysics, turns out there really are etheric parasites making us hallucinate!”.

Maybe my specific answer, “qualia-having is a signifier for moral relevance”, isn’t exactly right; and indeed, it doesn’t quite feel right to me. But whatever the real answer is, I expect it to look very similar, and to similarly dismiss the “hard problem” as lacking substance.

4. Moral Implications. If you accept the above, then the whole “qualia” debacle is just a massive red herring caused by the idiosyncrasies of our mental architecture. What does that imply for ethics?

Well, that’s simple: we just have to re-connect the free-floating “qualia” concept with the definition of qualia. We value things that have first-person experiences similar to ours. Hence, we have to isolate the algorithms that allow things to have first-person experiences like ours, assign moral relevance to whatever implements them, and dismiss the moral relevance of everything else.

And with there not being some additional “magical fluid” that can confer moral relevance to a bundle of matter, we can rest assured there won’t be any shocking twists where puddles turn out to have been important this entire time.

  1. ^

    In the Dehaene sense: i.e., the data are pooled together and consolidated into a world-model, which is fed as input to the planning algorithm.

  2. ^

    Unless it really is etheric parasites or something. I mean, it might be!