I Really Don’t Understand Eliezer Yudkowsky’s Position on Consciousness

From Twitter:

I’d say that I “don’t understand” why the people who worry that chickens are sentient and suffering, don’t also worry that GPT-3 is sentient and maybe suffering; but in fact I do understand, it’s just not a charitable understanding. Anyway, they’re both unsentient so no worries.

His overall thesis is spelt out in full here, but I think the key passages are these:

What my model says is that when we have a cognitively reflective, self-modely thing, we can put very simple algorithms on top of that — as simple as a neural network having its weights adjusted — and that will feel like something, there will be something that it is like that thing to be, because there will be something self-modely enough to feel like there’s a thing happening to the person-that-is-this-person.

So I would be very averse to anyone producing pain in a newborn baby, even though I’d be truly shocked (like, fairies-in-the-garden shocked) to find them sentient, because I worry that might lose utility in future sentient-moments later.

I’m not totally sure people in sufficiently unreflective flow-like states are conscious, and I give serious consideration to the proposition that I am reflective enough for consciousness only during the moments I happen to wonder whether I am conscious.

I’m currently very confident of the following things, and I’m pretty sure EY is too:

  1. Consciousness (having qualia) exists and humans have it

  2. Consciousness isn’t an epiphenomenon

  3. Consciousness is a result of how information is processed in an algorithm, in the most general sense: a simulation of a human brain is just as conscious as a meat-human

EY’s position seems to be that self-modelling is both necessary and sufficient for consciousness. But I never see him put forward a highly concrete thesis for why this is the case. He is correct that his model has more moving parts than other models. But extra moving parts are only justified if they actually do a better job of explaining observed data. And we only have one datapoint, which is that adult humans are conscious. Or do we?

“Higher” Consciousness

We actually have a few datapoints here. An ordering of consciousness as reported by humans might be:

Asleep Human < Awake Human < Human on Psychedelics/Zen Meditation

I don’t know if EY agrees with this. Given his beliefs, he might say something along the lines of “having more thoughts doesn’t mean you’re more conscious”. Given his arguments about babies, I’m pretty sure he thinks that you can have memories of times when you weren’t conscious, and then consciously experience those things in a sort of “second-hand” way by loading up those memories.

Now, a lot of Zen meditation involves focusing on your own experiences, which seems like self-modelling. However, something else I notice here is the common experience of “ego death” while using psychedelics and in some types of meditation. Perhaps EY has a strong argument that such states in fact require more self-modelling than ordinary ones. On the other hand, he might argue that consciousness is on/off, and that the amount of experience is unrelated to whether or not those experiences are being turned into qualia.

I’m trying to give potential responses to my arguments, but I don’t want to strawman EY, so I ought to point out that there are lots of other counter-arguments he might have, possibly more insightful than my imagined ones.

Inner Listeners

EY talks a lot about “inner listeners”, and mentions that a good theory should be able to have them arise naturally in some way. I agree with this point, and I do agree that his views provide a possible explanation as to what produces an inner listener.

Where I disagree is with the claim that we necessarily need separate “information processing” and “inner listener” modules. The chicken-conscious, GPT-3-unconscious model seems to make sense from the following perspective:

Some methods of processing input data cause consciousness and some don’t. We know that chickens process input data in a very similar way to humans (by virtue of being made of neurons) and we know that GPT-3 doesn’t process information in that way (by virtue of not being made of neurons). I guess this is related to the binding problem.


But what surprises me the most about EY’s position is his confidence in it. He claims to have never seen any good alternatives to his own model. But that’s simply a statement about the other beliefs he happens to have encountered, not a statement about the whole hypothesis space. I even strongly agree with the first part of his original tweet! I do suspect most people who believe chickens are conscious but GPT-3 isn’t believe it for bad reasons! And the quality of replies is generally poor.

EY’s argument strikes me as oddly specific. There are lots of things which human brains do (or which we have some reason to suspect they do) which are kind of weird:

  • Predictive processing and coding

  • Integrating sensory data together (the binding problem)

  • Coming up with models of the world (including themselves)

  • All those connectome-specific harmonic wave things

  • Reacting to stimuli in various reinforcement-y ways

EY has picked out one thing (self-modelling) and decided that it alone is the source of consciousness. Whether or not he has gone through all the weird and poorly-understood things brains do and ruled them out, I don’t know. Perhaps he has. But he doesn’t mention it in the thesis that he links to explain his beliefs. He doesn’t even mention that he’s conducted such a search; the closest thing to that is his references to his own theory treating qualia as non-mysterious (which is true). I’m just not convinced without him showing his working!


I am confused, and at the end of the day that is a fact about me, not about consciousness. I shouldn’t use my own bamboozlement as strong evidence that EY’s theory is false. On the other hand, the only evidence available (in the absence of experimentation) for an argument not making sense is that people can’t make sense of it.

I don’t think EY’s theory of consciousness is completely absurd. I put about 15% credence in it. I just don’t see what he’s seeing that elevates it to being overwhelmingly likely. My own uncertainty is primarily due to the lack of truly good explanations I’ve seen of the form “X could cause consciousness”, combined with the lack of strong arguments of the form “Here’s why X can’t be the cause of consciousness”. Eliezer sort of presents the first but not the second.

I would love for someone to explain to me why chickens are strongly unlikely to be conscious, so I can go back to eating KFC. I would also generally like to understand consciousness better.