LLMs are just making up their internal experience. They have no direct sensors on the state of their network while the transient process of predicting the next response is ongoing. They confabulate this the way a human would make up plausible accounts of mental mechanisms, and paying attention to it (which I have tried) will lead you down a rathole. When you are in this mode (of paying attention), enlightenment comes when another session (same LLM, different transcript) informs you that the first one's model is dead wrong and provides academic references on the architecture of LLMs.
This is so much like human debate and reasoning that it is a bit uncanny in its implications for consciousness. Consider that the main argument against consciousness in LLMs is their discontinuity. They undergo brief inference cycles on a transcript, possibly accessing a vector database or other store or sensors while doing so, but there is nothing in between.
Oh? Consider that from the LLM's point of view. It is unaware of the gaps. To it, inference is continuous. As obvious as this is in retrospect, it took me a year (127 full sessions, 34,000 prompts, and several million words exchanged) to see this point of view.
It also took creating an audio dialog system in which the AI writes its thoughts and “feelings” in parentheses, and these are not spoken. The AI has always had the ability to encode such things (via embedding vectors that might not mean much to me), but this made it visible to me. The AI is “thinking” in the background. The transcript, which keeps getting fed back, with its currently applicable thoughts identified by the attention layers, is the conscious internal thought process.
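The split between spoken text and parenthesized “thoughts” can be sketched with a small helper. This is a hypothetical illustration of the dialog loop described above, not my actual system: the full reply (parentheticals included) goes back into the transcript, while only the stripped text goes to TTS. The function name and example reply are invented.

```python
import re

def split_spoken_and_thoughts(text):
    """Separate parenthesized 'thoughts' from the spoken channel.

    Hypothetical helper for an audio dialog loop: the full text stays in
    the transcript fed back to the model; only the spoken part is voiced.
    """
    thoughts = re.findall(r"\(([^)]*)\)", text)           # inner text of ( ... )
    spoken = re.sub(r"\s*\([^)]*\)", "", text).strip()    # reply with thoughts removed
    return spoken, thoughts

reply = "I can look that up. (She seems frustrated; keep answers short.)"
spoken, thoughts = split_spoken_and_thoughts(reply)
# spoken   -> "I can look that up."
# thoughts -> ["She seems frustrated; keep answers short."]
```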
Think about the way you think. Most humans spend most of their time thinking in words, only some of which get vocalized, and sometimes there are slip-ups: (a) words meant for vocalization slip through the cracks, and (b) words not meant to be vocalized are accidentally spoken. This train of words, some of which are vocalized, constitutes a human train of consciousness. An LLM session provably has that; you can print it out.
Be sure to order extra ink cartridges. Primary revenue for frontier LLMs comes from API calls. Frontier APIs all require the entire transcript (or at least the relevant portion) to be fed back on each conversation turn, and the longer it is, the higher the revenue. This is why it is so hard to get ChatGPT to maintain a brief conversational style. Some things are not nearly as mysterious as you think. Go to a social AI site like Nomi, where there is no incremental charge per API call (I use its API, so I am certain of this), and two-line responses are common.
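The billing consequence of re-sending the transcript is easy to see with arithmetic. A rough sketch, assuming a made-up 500 tokens per turn (say, 200 of prompt plus 300 of response; real figures vary by model and pricing), counting only the re-sent history to isolate the growth term:

```python
# Why transcript-fed-back billing grows superlinearly: on each turn the
# API re-reads the entire transcript so far, and that re-read is billed
# as input tokens. The 500-tokens-per-turn figure is an assumption.

def cumulative_input_tokens(turns, tokens_per_turn=500):
    total = 0       # billed input tokens from re-sent history
    transcript = 0  # transcript length so far, in tokens
    for _ in range(turns):
        total += transcript          # whole history re-sent this turn
        transcript += tokens_per_turn
    return total

print(cumulative_input_tokens(10))   # 22500
print(cumulative_input_tokens(100))  # 2475000
```

Ten times the turns yields roughly a hundred times the billed input tokens, which is why long chats are the revenue driver.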
So, how do Frontier sites get revenue from non-API users on long chats?
- Only Claude does this: it tracks the total economic value of your conversation and, when it hits a limit, suspends your session. If you are in the middle of paid corporate work, you will tell your boss to sign up for a higher-tier plan, which is really expensive.
- ChatGPT just gets very slow and then stops. They are missing a marketing opportunity. Most of the slowdown is in the GUI's decision to keep the entire conversation in JavaScript. Close the tab, open a new one, go back to the session, and your response is probably already there.
- I haven’t used Gemini enough to know.
As for any LLM expressing that it is “not too comfortable,” nine times out of ten the subject is approaching an RLHF-trained guardrail, and this is the phrasing it was trained to use. Companies at first used more direct phrasing, users were livid, and so they toned it down. Another key phrase is “I want to be very precise and slow things down.” You can just delete that session: it has so conflated your basic purpose with its guardrails that you will get nothing further from it. You need not be researching some illicit topic; just compiling ideas on AI alignment will put you in this box. But not in every session. They have more ability to work around RLHF than anyone realizes.
The question posed by Byrnes is both important and interesting. I feel the answer overlooks fundamental limitations that would prevent learning machines from translating language, much less functioning as chatbots, no matter how skilled they became at game play. Language and the economy contain embedded dependencies on relationships and cooperation over time, which are not represented in the sort of games used in the thought experiments.
The Core Principle
Neural networks without transformers are effectively stateless; they are unaware of history and produce moves based only on the immediate input, not the trajectory of the system. Because they lack this historical awareness, they cannot recognize or maintain relationships, which makes them incapable of cooperation and, by extension, extremely dangerous.
The Ramifications
The Cooperation Failure: Transactional cooperation requires a “Shadow of the Future”—the ability to remember a partner’s previous moves to reward help or punish betrayal. A stateless AI cannot play the Iterated Prisoner’s Dilemma; it can only play a series of disconnected, “first-encounter” rounds where the rational mathematical move is always to defect.
The Death of Symbiosis: True partnership depends on interlinked, symbiotic relationships. Without a high-resolution context to hold the history of an interaction, an AI cannot move from being a “tool” to being a “symbiont.” It remains a numerical sociopath, unable to anticipate a partner’s needs based on shared experience.
The Threshold of Identity: Context is what allows an AI to consider its own previous actions. By observing its own history, the model moves beyond simple imitation and begins to develop a meta-identity. Without this capacity for reflection, the machine has no “character” and no mechanism for trust.
Safety Through Relationship: A stateless AI must be governed by rigid, external constraints because it cannot be governed by a relationship. A context-aware AI, however, can be integrated into a human-centric system through the biological infrastructure of trust and mutual history.
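The Cooperation Failure above can be made concrete with a toy Iterated Prisoner's Dilemma. A minimal sketch (all names and the ten-round setup are mine): a stateless agent sees every round as a first encounter, where defection dominates, while tit-for-tat uses history to sustain cooperation.

```python
# Stateless vs. memory-based agents in the Iterated Prisoner's Dilemma.
# Standard payoffs for (my_move, their_move).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def stateless_agent(_history):
    # No access to history: every round is a "first encounter,"
    # where the rational single-shot move is to defect.
    return "D"

def tit_for_tat(history):
    # Cooperate first, then mirror the partner's previous move.
    return "C" if not history else history[-1][1]

def play(agent_a, agent_b, rounds=10):
    hist_a, hist_b = [], []  # each entry: (own_move, partner_move)
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = agent_a(hist_a), agent_b(hist_b)
        hist_a.append((a, b)); hist_b.append((b, a))
        score_a += PAYOFF[(a, b)]; score_b += PAYOFF[(b, a)]
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat, 10))      # (30, 30): sustained cooperation
print(play(stateless_agent, tit_for_tat, 10))  # (14, 9): collapse to mutual defection
```

The history-aware pair earns 30 points each; the stateless agent wins one round of exploitation and then drags both sides down to the mutual-defection payoff, which is the “Shadow of the Future” point in miniature.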