This is an interesting proposal. I just have a few thoughts.
I’ve come to see consciousness as the brain’s way of constructing internal narratives of experience. When I am conscious of something, I can recall it later or describe it to someone else. Most cognitive processes are unconscious not because they have no impact on our thoughts but rather because our brains don’t record a narrative of what they’re doing.
Consciousness is not the same thing as awareness or even self-awareness. I can be paying “attention” to the road unconsciously and not recall a single thing about the drive at the end of it because the route was so familiar and nothing noteworthy occurred. A robot can have an internal model of its own body position, but that type of self-awareness doesn’t feel like what we mean by the word “consciousness”. Conversely, I would argue that most nonhuman animals are conscious in a nontrivial way, even if they don’t have the language to report on it and even if they fail the mirror test of “self-awareness”. Furthermore, even a mind without long-term memory can still be considered conscious from moment to moment, in that the mind is continually constructing narratives of what it experiences, but the lack of access to previously recorded narratives makes it seem to that mind that those experiences were unconscious.
Language models can certainly construct narratives in one sense, but those narratives typically aren’t about anything other than the relationships among the words that came before. They are trained to say what they predict a human would say given the context. I’m not saying that language models can’t be conscious. In fact, I think they are currently the closest models we have to conscious algorithms, especially those that ground their symbols. However, I am saying that self-reports of consciousness are not what we should look for. Sure, self-report is correlated with consciousness in humans (as are self-awareness, sentience, etc.), but that correlation doesn’t generalize to an alien cognitive architecture.
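To be concrete about what I mean by “trained to say what they predict a human would say,” here is a toy sketch of the next-token objective. It is my own illustration, not any particular model’s code, and the sizes and architecture are arbitrary:

```python
# Toy sketch of the next-token objective (my own illustration, not any real
# model's code). The only training signal is "predict the word a human wrote
# next, given the words that came before," so whatever narrative structure the
# model builds is structure over token statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, embed_dim, context_len = 100, 32, 4    # arbitrary toy sizes

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),            # words -> vectors
    nn.Flatten(),                                   # concatenate the context
    nn.Linear(embed_dim * context_len, vocab_size)  # distribution over the next word
)

context = torch.randint(0, vocab_size, (1, context_len))  # the words that came before
next_token = torch.randint(0, vocab_size, (1,))           # the word a human actually wrote

logits = model(context)
loss = F.cross_entropy(logits, next_token)  # reward for matching the human's next word
loss.backward()                             # that is the entire objective
```

Nothing in that loss refers to anything outside the text, which is the sense in which the resulting “narrative” is only about the words themselves.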
I’m not sure what the test would have to be, except that you would have to somehow look for structured patterns of activity that are generated by what the language model experiences and that guide the model’s interpretations and decisions about what comes next. A purely text-based language model could only ever have a very low-level form of consciousness, in my opinion, even though language is the hallmark of higher-level consciousness in humans. Based on my understanding, I think we’ll start to see recognizable consciousness in the descendants of models like those that add captions to images. A model that can watch a video and then describe its contents in conversation or follow the instructions contained in the video could almost certainly be called conscious, even if the narratives it can hold in its head are not as sophisticated as those a human can handle.
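I don’t have a concrete protocol, but here is a very rough sketch of the kind of thing I’m gesturing at, using made-up stand-in data rather than a real model: record the model’s internal activations, ask whether a simple probe can decode some structured “narrative” variable from them, and then ask whether that variable actually predicts what the model does next rather than sitting there inertly.

```python
# Very rough sketch of the kind of test I'm gesturing at (hypothetical, not an
# established protocol). The activations, labels, and decisions are random
# stand-ins; in a real test they would come from the model being studied.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in hidden activations for 1,000 inputs, plus a label for some aspect of
# what the model "experienced" (say, which of two situations the input described).
activations = rng.normal(size=(1000, 256))
experience = rng.integers(0, 2, size=1000)
activations[:, 0] += 2.0 * experience       # pretend the experience leaves a trace in activity

# Stand-in for the model's subsequent decision, mostly driven by that same trace.
decision = np.where(rng.random(1000) < 0.9, experience, 1 - experience)

X_train, X_test, y_train, y_test = train_test_split(activations, experience, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Question 1: is the "narrative" decodable from the pattern of activity at all?
print("probe accuracy on held-out activations:", probe.score(X_test, y_test))

# Question 2: does that pattern actually guide what the model does next?
print("how often the decision tracks the decoded variable:", (decision == experience).mean())
```

The two checks correspond to the two properties above: the pattern has to be generated by what the model experiences, and it has to guide the model’s decisions about what comes next.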
Finally, on the subject of moral patienthood for conscious algorithms, I would say that consciousness is a necessary but not sufficient condition. An algorithm would also need some form of sentience, such as the ability to experience suffering (aversive behavioral motivation), before it could be said to have moral worth. Again, consciousness and sentience are correlated in humans and nonhuman animals, but not (yet) in AI.