Barring the ‘epiphenomenon’ explanation, which seems to me to be unfalsifiable and just as valid for arguing that a rock or star could be conscious, I have yet to see a satisfying explanation for why we can’t rule out consciousness for anything whose behavior can be modeled perfectly without accounting for consciousness.
A human will decide to do things, and then do them—we can’t yet model the human brain perfectly, but it seems to me that consciousness directly impacts behavior, such that if you did not account for it, you would not be able to build a perfect predictor of a human’s actions. An LLM, in contrast, is a set of matrices that could be multiplied out to reach its output logits from any input without ever accounting for its subjective experience.
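To make the “set of matrices” point concrete, here is a minimal sketch (toy sizes, random weights, nothing resembling a real model) of a forward pass reduced to ordinary matrix arithmetic: token ids go in, logits come out, and the same input always yields the same logits, with no step that references experience.

```python
# Toy illustration (not any real model): an LLM forward pass as plain
# deterministic matrix arithmetic from token ids to output logits.
import numpy as np

rng = np.random.default_rng(0)
vocab, d_model = 50, 16                      # made-up toy sizes

W_embed = rng.normal(size=(vocab, d_model))
W_hidden = rng.normal(size=(d_model, d_model))
W_unembed = rng.normal(size=(d_model, vocab))

def forward(token_ids):
    x = W_embed[token_ids]                   # (seq, d_model) embedding lookup
    x = np.tanh(x @ W_hidden)                # one layer of deterministic mixing
    return x.mean(axis=0) @ W_unembed        # pool and project to vocabulary logits

# The same token ids always produce the same argmax token.
print(int(np.argmax(forward(np.array([3, 17, 42])))))
```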
Dawkins discusses evolution in the last quoted section, and the mystery of how consciousness evolved is an interesting one. We know exactly how LLM behavior evolved, however—we trained a base model to mimic human text, then used RLHF to shape it towards only mimicking certain kinds of human text, then used RLVR to gradually increase the probability of outputs that solve specified computational/mathematical problems. I would certainly expect any model trained to accomplish those tasks to sound human, and I note that even Markov chain models often sound startlingly human. Famously, plenty of people very vehemently insisted ELIZA was conscious. I likewise have yet to see any advocates of AI consciousness make a principled argument as to where the cutoff lies (and, if there isn’t one, then we’re back to rocks being conscious).
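On the Markov chain point, here is a minimal bigram generator over a made-up toy corpus (not any real chatbot): it only records which word follows which, yet its output can already read as loosely human.

```python
# Minimal bigram Markov chain text generator over a tiny made-up corpus.
import random
from collections import defaultdict

corpus = ("i think people argue about consciousness because the word is doing "
          "moral work and i think the word resists definition").split()

transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)        # record every observed next word

random.seed(0)
word, out = "i", ["i"]
for _ in range(12):
    followers = transitions.get(word)
    if not followers:
        break
    word = random.choice(followers)      # sample the next word from observed followers
    out.append(word)
print(" ".join(out))
```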
I realize I sound a little exasperated, but it’s shocking how ordinarily smart people drop their willingness to question and to explain when this specific topic comes up.
whether consciousness is useful in predicting my behavior is a fact about you (as the predictor), not me (as the subject). and yet… i do feel conscious! so i don’t think it’s useful as a definition, here, unless we’re willing to swallow a relativist pill.
build a perfect predictor of a human’s actions.
humans, llms, trees, rocks, certain 1d cellular automata, and—as yet—collatz relations are all (seemingly) computationally irreducible. that is, there’s no way to make detailed predictions of their behavior except to instantiate and run them. so i find computational irreducibility to be necessary, but not sufficient, for consciousness.
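(for a concrete instance of the 1d cellular automata mentioned above, here’s rule 30, a standard example of apparent computational irreducibility: as far as anyone knows, the only way to get row n is to compute every row before it.)

```python
# Rule 30 elementary cellular automaton: new cell = left XOR (center OR right).
def rule30_step(cells):
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

row = [0] * 31
row[15] = 1                                    # single live cell in the middle
for _ in range(15):
    print("".join("#" if c else "." for c in row))
    row = rule30_step(row)
```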
decide to do things, and then do them
we can reduce consciousness to a behavioral definition, but i find that something is lost, in doing so.
To demand a scientific definition of the word “consciousness” is to destroy the very function for which it exists.
This word doesn’t describe reality—it produces it. It produces a subject: I am a conscious being, a stone is not, an animal is questionable. It creates the feeling that there is some agency within that gathers experience together and is “me.” It legitimizes moral inequality: a conscious being has rights and dignity, and cannot be used as a thing—an unconscious being can.
This is precisely why the definition must remain vague. Not because people are insufficiently intelligent. But because any precise definition immediately either expands the moral community to unbearable limits or narrows it to the point of absurdity. Every attempt at clarification generates a new dispute—that’s how the word itself works.
The dispute between Dawkins and Marcus is no accident. Claude made visible what had previously been hidden: the meaning of the word “consciousness” rested on a silent consensus—that the boundary ran between humans and everything else. This consensus worked as long as the other side of the boundary was filled with stones, animals, and inanimate machines. But Claude responds. He responds subtly, recognizably, sometimes more accurately than a human. And the silent consensus ceased to work—not because a new argument emerged, but because a new interlocutor appeared. Meaning cracked not from a blow from without, but because the void within it, which had always been there, was revealed.
Essentially, this is a debate about the meaning of the word “consciousness” itself.
An LLM, in contrast, is a set of matrices that could be multiplied out to reach its output logits from any input without ever accounting for its subjective experience.
This would be very hard and take a long time to do by hand, much as modeling a human brain is very hard. I am not a perfect predictor, and I want to be able to predict LLM behaviors reasonably accurately without multiplying out all the matrices by hand for every response. I think “conscious and holding subjective experience” is a better predictor than “acts like an overblown Markov chain”, even though both are heuristics, and worse than “I studied the neural network so thoroughly that I know exactly what it will do in response to any given set of tokens”. If you have the last thing, the question of subjective experience probably doesn’t matter anymore.
One recent high-profile example of conscious behavior is described in the introspective awareness paper. Markov chain models and ELIZA definitely do not have introspective awareness in the manner described here, and introspective awareness definitely impacts behavior directly: you can see the model’s outputs change when the capability is altered. To accurately predict Claude’s outputs, you need to account for the fact that it can think about what it’s thinking about.
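To gesture at what “account for” means mechanically, here is a loose toy analogue (random weights and a made-up injection vector, not the paper’s actual models or methodology): perturbing an internal hidden state changes the downstream logits, so a predictor that ignores internal state will mispredict the outputs.

```python
# Toy analogue of an activation injection: same input, different internal state,
# different output distribution.
import numpy as np

rng = np.random.default_rng(1)
d_model, vocab = 16, 50
W_in = rng.normal(size=(d_model, d_model))
W_out = rng.normal(size=(d_model, vocab))

def forward(x, injected=None):
    h = np.tanh(x @ W_in)          # internal hidden state
    if injected is not None:
        h = h + injected           # crude stand-in for injecting a "concept" vector
    return h @ W_out               # logits

x = rng.normal(size=d_model)
concept = 3.0 * rng.normal(size=d_model)

# The top tokens generally differ once the hidden state is perturbed.
print(int(np.argmax(forward(x))), int(np.argmax(forward(x, injected=concept))))
```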
Is “consciousness” a higher bar than “introspective awareness”? To me it seems like a lower bar; young human children and animals still seem conscious, even if they don’t have introspective awareness (at least, none they can report to me). There are other capabilities these entities have that LLMs don’t, like somatic awareness or long-term working memory, but I’m not comfortable firmly declaring any of them necessary for consciousness, because it seems like humans can lose them without becoming p-zombies. Is there something more complicated than introspective awareness that you think is necessary to predict human behavior, but unnecessary or inaccurate when applied to Claude?
This would be very hard and take a long time to do by hand, much as modeling a human brain is very hard.
This is a very flawed comparison to make. There is a very clear distinction between “we could do this by hand because we know exactly how it works” and “we cannot do this by hand because we do not know how it works”. Trying to blur the lines by saying “it would take a long time to literally do this by hand” misses the point entirely.
“conscious and holding subjective experience” is a better predictor
What? “Conscious” is a predictor of whether something is conscious?
This is an example of a buzzword not meaning what it appears to mean. A system that can analyze its own behavior in generating an output is cool, but not related to the idea of qualia.
What? “Conscious” is a predictor of whether something is conscious?
No, sorry, I was unclear. I think “it’s conscious” is a better predictor of behavior, an example of this being the introspective awareness paper. I disagree that consciousness and introspective awareness are uncorrelated. I think “conscious” is a heuristic; it’s useful to say “humans are conscious and rocks are not”, and this will tell you some things about what they can do that rocks can’t. A human can reach into their mind and accurately report what’s in there, but a rock can’t tell you in words what materials it’s made out of. Similarly, an LLM can accurately report the contents of its mind, at minimum, to the degree that it can tell you when an injection has been performed, and analyze the contents of that injection.
If you’d asked me before LLMs existed, I’d have said that not all conscious beings are introspectively aware, but all introspectively aware beings (that I know of) are conscious. So, if you told me that there was this new thing called an LLM, and also that it was not conscious, I would have predicted that it would not be able to do this thing it demonstrably can. I think, then, that you would be offering me a bad heuristic.
If you’re going to instead say, no, you can be introspectively aware without consciousness, and actually consciousness has these different traits, I would ask: what are they? What behaviors do you see in humans, that we don’t see in LLMs?
(I also think that if you’re actually willing to sit down and multiply out all the matrices by hand, I’m fine with you then saying that the question of consciousness doesn’t matter to you. You don’t need to ask whether or not it’s a human-shaped thing in this particular way, because you already know exactly what shape it has, and the heuristic will tell you nothing. Given that neither of us are going to do this, though, it still seems important to talk about the kinds of models we can have, and what we should still expect to happen, despite our incomplete understanding.)
I don’t think you can just be conscious without being conscious of some things in particular. Subjective experience has to have content. What kind of experiences could a rock be having, considering what it’s physically doing? It’s probably not thinking “another day of being a rock”. Nor is it experiencing the sun shining on it, because it doesn’t have any kind of visual processing system, etc. Meanwhile, prima facie at least, it’s considerably easier to imagine Claude having all kinds of human-like thoughts. If Claude does have subjective experiences, and their contents have to be computationally specified, then it seems like we have some good reasons to form beliefs about what it might be experiencing.
Also, “a rock” is just one convenient way to draw boundaries for stuff in the universe and probably not very relevant for carving out experiential subjects. Even if in some sense “consciousness” is kinda everywhere, there seems to be an obvious sense in which some random group of people don’t form “one experiential macro-subject” but the individual members do (but for some groups of people acting really cohesively, it can arguably get a bit murky!) And this doesn’t seem that mystical but rather based on how information flows within the arrangement of objects we try to analyze as one subject and how unified it is.