Large Language Models Live in Time

Crossposted from my Substack.

Epistemic status: wild brainstorming.

LLMs don’t seem to live in time. That is, they don’t seem to have a continuous personal identity in the way we typically understand it for humans. My claim here, however, is that the question of whether LLMs somehow represent or track temporal structure is still a meaningful one and likely plays a role in explaining and predicting how these systems work. My goal is to argue that we need more empirical work focused on LLM time, i.e., LLM neurochronometry studies, and to share three key intuitions that draw on connections to neuroscience. I’m nowhere near offering a research agenda here. Still, I’d be interested in hearing thoughts on what such an agenda could look like, especially in light of work on notions of LLM selfhood and on multi-agent interactive environments that enable the development of different LLM personas, such as the AI Village or moltbook.

The LLM–Calendar Time Asymmetry

In the Scheming AIs report, Carlsmith points out the uncertainty about how models will think about time at different stages of training. He defines the notion of an episode as “the temporal horizon that the gradients in training actively pressure the model to optimize over” and suggests that this may differ from what would intuitively count as an episode, namely something more like a single interaction. Carlsmith flags that there is no reason to assume that the model will, by default, think in calendar-time terms, especially if it does not have situational awareness. He then goes on to discuss other units that would be more natural for LLMs to think in when it comes to time. These are:

  • Time steps in a simulated environment: useful for RL setups.

  • Tokens per user interaction: tightly connected to what the model can condition on and how attention effects unfold.

  • Forward passes: useful for questions like when a representation becomes available internally relative to the produced text.

These are all very reasonable candidates for measuring time if you are an LLM, according to Carlsmith. He also notes that these units need not be in sync with calendar time: training might pause, simulated environments run at all sorts of speeds, and users may take their time to respond in chats. So there is no 1:1 correspondence between these units and calendar time, as the toy sketch below illustrates.
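To make the point concrete, here is a toy, model-free sketch of three “LLM clocks” drifting away from calendar time. Everything in it (the random token counts, the simulated user delays) is invented purely for illustration; it is not a measurement of any real system.

```python
# Toy illustration (no real model involved): three "LLM clocks" vs. calendar time.
# All quantities and delays below are made up to show that the clocks
# need not stay in sync with wall-clock time.
import random
import time

env_steps = 0       # time steps in a simulated environment
tokens_seen = 0     # tokens per user interaction
forward_passes = 0  # roughly one per generated token in autoregressive decoding

start = time.time()
for turn in range(3):
    # The user takes some wall-clock time to compose a message
    time.sleep(random.uniform(0.0, 0.2))
    user_tokens = random.randint(5, 20)
    tokens_seen += user_tokens

    # The model replies, spending roughly one forward pass per output token
    reply_tokens = random.randint(10, 30)
    forward_passes += reply_tokens
    tokens_seen += reply_tokens
    env_steps += 1

elapsed = time.time() - start
print(f"env steps: {env_steps}, tokens: {tokens_seen}, "
      f"forward passes: {forward_passes}, calendar seconds: {elapsed:.2f}")
```

Run it twice with different delays and the three internal clocks come out identical while the calendar clock does not, which is the whole point.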

Below, I consider three neuroscience-inspired pathways for thinking about this asymmetry between LLM time and calendar time.

Three Neuroscience Parallels

1. Neurochronometry and interpretability

In the brain sciences, neurochronometry is the study of the timing of neural processes that give rise to perception, cognition, and action, i.e., mapping when specific computations happen in the brain and how these timings relate to behavior. I think that as AI models become more capable, studies on AI neurochronometry will become essential for examining the connections between a model’s neural processes and its behavior. This already seems to be part of interpretability: finding how activations at the algorithmic level correspond to model behaviors. It also seems pretty challenging.

Mechanistic interpretability could potentially act as the equivalent of EEG or fMRI. The parallels here are contentious, but broadly, one could argue for:

1) the EEG analogue being the trajectories of activations across layers or tokens, and

2) the fMRI analogue being mappings of representations to components via different methods of dimensionality reduction, e.g., PCA or t-SNE.
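As a loose illustration of the first analogue, here is a minimal sketch that records one “activation trajectory”: the hidden state of the final prompt token at every layer of a small open model, projected to 2D with PCA. The model name (gpt2), the choice of token position, and the use of PCA are assumptions made for illustration, not a claim about how such an analysis should be done.

```python
# Hedged sketch: treat the per-layer hidden states of one token as an
# "activation trajectory" and summarize it with PCA. "gpt2" is a placeholder;
# any causal LM that exposes hidden states would do.
import torch
from sklearn.decomposition import PCA
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

prompt = "The experiment began at"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (num_layers + 1) tensors of shape [batch, seq_len, dim]
hidden_states = outputs.hidden_states

# Trajectory of the final token's representation across layers ("layer time")
trajectory = torch.stack([h[0, -1, :] for h in hidden_states])  # [layers + 1, dim]

# Project the trajectory to 2D; each point is one layer's representation
points = PCA(n_components=2).fit_transform(trajectory.numpy())

for layer, (x, y) in enumerate(points):
    print(f"layer {layer:2d}: ({x:+.2f}, {y:+.2f})")
```

One could run the same analysis across token positions instead of layers, which is closer to the “trajectory across tokens” reading of the EEG analogue.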

2. Left-brain interpreter and chain-of-thought

In neuropsychology, Michael Gazzaniga’s “left-brain interpreter” hypothesis suggests that our verbal centers are constantly coming up with post-hoc narratives to make sense of actions initiated by other, non-verbal parts of the brain. We do something, and then we tell ourselves a story about why we did it to maintain a coherent sense of self.

Chain-of-thought in LLMs may work in a similar way. There’s no guarantee that when the model “shows its thoughts” it is verbalizing what’s really going on internally. Especially for behaviors relevant to strategic deception, the model is typically incentivized to do the opposite.

By engaging in chain-of-thought, the LLM creates a temporary history for its current response. This allows it to “remember” its own logic from three sentences ago, simulating a form of short-term temporal identity that wouldn’t exist in a zero-shot prompt. It could be that the “decision” to move toward a specific conclusion happens long before the chain-of-thought has reached that conclusion.

3. Exploiting a “readiness potential”

If there is a “time gap” between the neural computation and its verbalization, it could be that LLMs also exhibit something like the “readiness potential” made famous by Benjamin Libet’s experiments. Libet asked participants to perform a simple motor action whenever they felt the urge to do so, while their brain activity was recorded. Comparing the timing of the recorded brain activity, the reported urge, and the movement itself, he found that the readiness potential built up a few hundred milliseconds before participants became consciously aware of the urge to move.

The model could “know” what it’s going to do in earlier layers, well before the corresponding tokens have been generated. There could thus be a gap between the intention to do X and X itself. In many safety-relevant scenarios we could exploit that gap to catch the model being strategically deceptive well before it actually behaves deceptively.
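One concrete way to look for such a gap is in the spirit of the logit lens (nostalgebraist): decode each intermediate layer through the model’s unembedding and ask at which layer the eventual next-token choice already dominates. The sketch below assumes a GPT-2-style model, so the module paths (model.transformer.ln_f, model.lm_head) are specific to that architecture; it illustrates the idea, it is not a deception detector.

```python
# Hedged "readiness" probe in the spirit of the logit lens: decode every layer's
# hidden state for the last position through the final norm and unembedding,
# and see how early the model's eventual next-token choice shows up.
# "gpt2" and the GPT-2-specific module paths are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs)

# The token the model would actually emit next
final_choice = out.logits[0, -1].argmax().item()

# Decode each layer's last-position hidden state through ln_f + lm_head
for layer, h in enumerate(out.hidden_states):
    layer_logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
    top = layer_logits.argmax().item()
    marker = "<-- matches final output" if top == final_choice else ""
    print(f"layer {layer:2d}: top token = {tokenizer.decode([top])!r} {marker}")
```

If the eventual answer becomes the top candidate many layers before the output, that is the kind of intention-to-action gap one might try to monitor in safety-relevant settings.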

Moreover, in multi-agent environments, the time it takes for an agent to process information could play a catalytic role in shaping its persona. An agent that thinks before speaking could be much more deliberate and, in effect, more capable than one that doesn’t.

How Could These Observations Be Useful?

LLM time is likely relevant for:

  • The model’s self-perception: how a system constructs a story about itself over time, which in turn influences its persona.

  • The model’s perception of others: its ability to model and relate to other agents or objects in time.

  • The model’s capabilities related to long-term planning and reasoning, and hence strategic deception.

  • Modeling aspects of AI welfare that require a notion of subjective experience.

Takeaways

  • Even if there’s no persistent self across interactions, models can still represent temporal structure in ways that shape behavior.

  • Chain-of-thought is likely not a faithful causal trace of the underlying computation, and thus not reliable on its own.

  • Timing gaps may be safety-relevant and could be leveraged for detecting emerging behaviors.

  • More research on operationalization could turn “LLM time” into measurable variables and experiments, e.g., by manipulating episode boundaries, pauses, memory, and compute expenditure.
