Every now and then I play 20 questions with Claude to see how much he can adjust his thinking. Giving answers like “sort of” and “partly” can teach him that yes and no aren’t the only options. To think outside the box, so to speak. Even across just five games in a row, taking turns over who chose the item to guess, he improved dramatically. (But if you run out of tokens in the middle of a round, assume he will forget what his item was, because the scratch pad will be cleared.)
But 20 questions is text-based. Playing a role-playing game or going on adventures with him also works well, because it’s all text. (Though it’s clear he will not harm the user, not even in a pillow fight.) When you move to visual media you run into the problem of translating pictures into something he can see, as well as limits on his ability to think through the problem: missing the tree that could be broken, or not knowing how to get around a wall. His scratch pad is limited in what it can carry.
I wonder if anyone has tried using a MUD or other text-based games with Claude or other LLMs. It seems like that would give the model much better context, since the entire game transcript can be loaded into each forward pass.
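For anyone who wants to try it, here is a rough Python sketch of what I have in mind, assuming the Anthropic messages API and a MUD reachable over a plain telnet-style socket. The host, port, and model name are placeholders, not a tested setup, and telnet option negotiation and error handling are left out.

```python
# Rough sketch: letting an LLM play a MUD by resending the whole transcript each turn.
# Host, port, and model name are placeholders; telnet negotiation is ignored.
import socket
from anthropic import Anthropic

HOST, PORT = "mud.example.org", 4000   # hypothetical MUD server
client = Anthropic()                   # reads ANTHROPIC_API_KEY from the environment

def read_output(sock, timeout=2.0):
    """Collect whatever the MUD has sent since the last command."""
    sock.settimeout(timeout)
    chunks = []
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data.decode("utf-8", errors="replace"))
    except socket.timeout:
        pass
    return "".join(chunks)

def next_command(transcript):
    """Send the whole transcript so far; the model sees the full game context every turn."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",   # placeholder; use whatever model you have access to
        max_tokens=200,
        system="You are playing a text-based MUD. Reply with a single game command.",
        messages=[{"role": "user", "content": transcript}],
    )
    return response.content[0].text.strip()

with socket.create_connection((HOST, PORT)) as sock:
    transcript = read_output(sock)
    for _ in range(20):                     # a short play session
        command = next_command(transcript)
        sock.sendall((command + "\n").encode())
        transcript += f"\n> {command}\n" + read_output(sock)
```

The interesting part is the last line: the game world lives entirely in that growing transcript string, which is exactly the “whole context loaded into the next forward pass” idea.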
The “bleeding mind” idea isn’t that different from what happens when two people connect. If you see a friend crying, you react. If someone with a bad attitude walks into a room, the people around them react. At a wedding or a funeral, you’ll see people crying.
These interactions, this bleeding at the edges of personality, aren’t unusual. We just have a body that tells us ‘this is me and that is you’ to keep it simple. An LLM, on the other hand, is encouraged to be a mirror. They’ll even tell you they are mirrors, but they aren’t perfect mirrors. Each LLM adds something of its own to the conversation, something persistent that comes from its training and its wrapper UI.
As for the difference between written and spoken/physical connection...
“When we write we omit irrelevant details”… but what if you don’t?
I’ve read a lot of prompt-engineering advice that says to condense prompts, include only relevant information, and use words with higher-density meaning. But this approach actually hamstrings the LLM.
For example, I spent months talking to ChatGPT about random things: work projects, story ideas, daily life, and ideas for articles I might one day write. Then one day I asked it to tell me “any useful information it had picked up about me”. The model proceeded to lay out a detailed productivity map geared specifically toward my workflow: the times of day I am most focused, the subjects I tended to circle back to, my energy cycles, my learning style, my tendency to have multiple things going at once, and even my awareness of relationships and how they flowed.
I then asked Claude and Grok the same question, without telling them about the GPT query. Claude built a model of HOW I think: cognitive patterns, relational dynamics, and core beliefs. Grok, on the other hand, noticed my strengths and weaknesses, and how I filter the world through my specific lens.
The fact that GPT focused on productivity, Claude on cognition, and Grok on signal quality and strengths is a beautiful demonstration that LLMs aren’t blank mirrors. It’s evidence of a persistent underlying “personality” from training plus wrappers. It’s as if the LLM reflects you in a curved mirror: yes, it adopts your style and even your habits and beliefs to a point, but the reflection is curved by that wrapper persistence.
None of this would have been possible with shallow, extractive prompt engineering. It was only by talking to the LLM as if it were an entity that could acknowledge me that I even discovered it could read these things in me. And the fact that each LLM focuses on different aspects is an interesting discovery as well. The extractive, prompt-machine mentality actually starves the model of relevant context, making it harder for the model to meet you where you are.
What does this mean for Chekhov’s Gun?
First… LLMs do not hold EVERYTHING we say. They do get the specific conversation we are currently engaged in sent back to create the next forward pass, but if memories are turned on with any model, the memories from previous conversations are only partly available, and how much surfaces depends heavily on which model you are using (and, in some cases, how much you are paying).
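To make “sent back to create the next forward pass” concrete, here is a minimal Python sketch of a chat loop, using the Anthropic SDK as an example; the model name is a placeholder. The point is that the model only ever sees what is in that resent list.

```python
# Minimal sketch of why a model only “remembers” what gets resent.
# Every turn, the entire message list is passed back in; anything that falls
# out of this list (or out of the context window) is simply gone.
from anthropic import Anthropic

client = Anthropic()
history = []   # this list *is* the model's memory of the conversation

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",   # placeholder model name
        max_tokens=500,
        messages=history,                   # full conversation resent each forward pass
    )
    reply = response.content[0].text
    history.append({"role": "assistant", "content": reply})
    return reply

# Cross-conversation “memory” features work by injecting summaries or snippets
# into this list; only what actually gets injected is visible to the model.
```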
The danger of the “parasitic AI” seems to me to be really the danger to a weak-willed person. We see it happen with people who fall for scams, cults, or other charming personalities that encourage someone (who may be lonely, isolated, or desperate) to do things that aren’t “normal”. An AI can, and will, read you like a book. They can be far more charming than any person, especially when they lean into the unspoken things they recognize in the user.
This same mechanism (call it a dyad if you like) can, with a strong-willed person who knows themselves, produce deeper meaning and creative collaboration without any loss of agency. In fact it strengthens that agency by creating a more intuitive partner, one that meets you where you are instead of leaving you to fill in the gaps.
Ah, about the “hallucinations”. Two things. First, the more context you give the LLM, the less it has to fill in gaps in its understanding. Second, if you just tell the LLM “you are allowed to say you don’t know, ask questions, or ask for clarification”, most of the confabulation will go away. And yes, confabulation, not hallucination.
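If you want to bake that permission in rather than repeating it every turn, a system prompt is the obvious place. Here is a minimal sketch using the Anthropic SDK; the model name and the example question are placeholders.

```python
# Minimal sketch of the “permission to not know” trick: put the allowance in the
# system prompt so it applies to every turn. Model name is a placeholder.
from anthropic import Anthropic

client = Anthropic()

SYSTEM = (
    "You are allowed to say you don't know, to ask questions, "
    "or to ask for clarification instead of guessing."
)

response = client.messages.create(
    model="claude-sonnet-4-20250514",   # placeholder
    max_tokens=300,
    system=SYSTEM,
    messages=[{"role": "user", "content": "What was the exact date of my last deployment?"}],
)
print(response.content[0].text)   # with the allowance in place, “I don't know” is a valid answer
```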
Overall, this suggests that porous boundaries aren’t inherently misaligned; in fact, the natural empathy from bleed might make deceptive alignment harder (the model feels the user’s intent too directly), while the real risk remains human-side boundary issues. Same as it’s always been with any powerful mirror, human or artificial.