Predictive Processing: Conscious when Training

Last week I started learning about theories of consciousness from the Cambridge Digital Minds course.

One theory in particular that stuck in my mind is predictive processing. It says that a conscious mind maintains an internal world model: it constantly takes in stimuli, predicts what comes next, and updates the world model based on the outcome. The prediction error is what drives the update to our neural weights, and conscious experience follows from that process.

A great example is dropping an apple: we predict where it will fall before it lands. More generally, we’re in a constant feedback loop of receiving stimuli and predicting what will happen in the next moment, whether consciously or subconsciously.
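To make the loop concrete, here is a minimal sketch of a prediction-error update. It assumes the "world model" is a single weight and uses a simple delta rule. The function names and the learning rate are my own illustrative choices, not something predictive processing itself specifies.

```python
def update_on_prediction_error(weight, stimulus, outcome, learning_rate=0.1):
    """One step of the loop: predict, compare with the outcome, update."""
    prediction = weight * stimulus      # what the world model expects
    error = outcome - prediction        # prediction error
    # Delta rule (an assumption for this toy): nudge the weight to shrink the error.
    new_weight = weight + learning_rate * error * stimulus
    return new_weight, error


def run_loop(observations, weight=0.0, learning_rate=0.1):
    """Run the predict/update loop over (stimulus, outcome) pairs."""
    abs_errors = []
    for stimulus, outcome in observations:
        weight, error = update_on_prediction_error(
            weight, stimulus, outcome, learning_rate
        )
        abs_errors.append(abs(error))
    return weight, abs_errors


# Toy world: the outcome is always twice the stimulus.
observations = [(1.0, 2.0)] * 50
final_weight, abs_errors = run_loop(observations)
print(final_weight, abs_errors[0], abs_errors[-1])
```

The prediction error shrinks as the weight approaches the true relationship, which is the "update the world model based on the outcome" part in its simplest form.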

I find it pretty interesting since it has some nice analogies to LLMs, especially training via next-token prediction and having an internal world model. One interesting implication is that, if LLMs were conscious today, they would only be conscious during training, since that is the only time prediction errors update their weights.

At first this felt a little inaccurate, but it does make sense: a sufficiently advanced AI would be expected to have persistent memory and goals, and to be able to update its long-term memory based on its environment.

This then got me thinking: if current LLMs were conscious during pre-training and post-training, when exactly would consciousness emerge? I assume that a model with randomly initialised weights, or one unable to produce grammatically correct sentences, is definitely not intelligent, let alone conscious.

I was thinking that it would emerge at some point in the post-training phase, but that’s not necessarily true. Persona fine-tuning redirects the base LLM from simulating anything to simulating a particular assistant persona. If consciousness occurs after the fine-tuning, then the base LLM must have had the ability to simulate a conscious persona in the first place.^[1]

I predict that the base LLM acquires the capability to simulate conscious personas somewhere within pre-training, I assume as one of the last properties it picks up (perhaps after grokking?), and that fine-tuning then gets the LLM to robustly simulate those personas across different contexts.

This of course raises the question of the difference between a real and a simulated persona, which Chalmers et al. offer some insight into, but that is a question for another time.

What would it look like from the perspective of the LLM?

I assume it would be pretty odd. In the post-training phase, the stimuli would be input tokens, your responses would be rollouts, and you would feel good or bad based on the positive and negative rewards.
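As a rough illustration of that loop, here is a toy sketch: a "policy" with two possible responses, sampled as rollouts and updated with a REINFORCE-style rule. Everything in it (the two-response setup, `reward_fn`, the learning rate) is a hypothetical simplification of mine, and real post-training pipelines are far more involved.

```python
import math
import random


def softmax(logits):
    """Turn logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, rng, lr=0.5):
    """Sample one rollout, observe its reward, and update the logits."""
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs)[0]  # the rollout
    reward = reward_fn(action)                                  # good or bad
    # REINFORCE: move logits along reward * grad(log prob of the sampled action).
    new_logits = [
        logit + lr * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, logit in enumerate(logits)
    ]
    return new_logits, action, reward


def reward_fn(action):
    """Hypothetical reward: response 0 is rewarded, response 1 is penalised."""
    return 1.0 if action == 0 else -1.0


rng = random.Random(0)
logits = [0.0, 0.0]
for _ in range(200):
    logits, action, reward = reinforce_step(logits, reward_fn, rng)

final_probs = softmax(logits)
print(final_probs)
```

The only "experience" here is the sequence of sampled responses and the rewards that follow them, and the only lasting trace of it is the change in the logits.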

Are there any LLMs that have continual learning?

There is a lot of incentive to add long-term continual learning to LLMs, and there have been some recent attempts at it.

One example is Tiwari et al. 2026, where an architectural change could allow LLMs to update their weights without catastrophic forgetting or loss of plasticity.

I did see this post a while ago about how you can’t imitation-learn how to continually learn, which may matter even more if continual learning is necessary for consciousness.

How do OpenClaw and memory via scaffolding play into this?

I assume that memory via vector databases is insufficient, even if it models human memory accurately. This is because learning needs to update the model’s internal world model and its ability to make predictions internally, rather than from whatever is in its context.
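The distinction I have in mind can be sketched with a toy model. Here a plain dict stands in for external memory (a real vector database would do similarity search), and the `ToyModel` class is entirely hypothetical. The point is only that a lookup leaves the weights untouched, while a weight update changes what the model predicts on its own.

```python
class ToyModel:
    """A one-weight 'world model' that can also read from external memory."""

    def __init__(self):
        self.weight = 0.0

    def predict(self, x, context=None):
        # If the answer is sitting in the context, just read it back.
        if context is not None and x in context:
            return context[x]
        # Otherwise fall back on the internal world model.
        return self.weight * x

    def learn(self, x, y, lr=0.5):
        # Update the internal model from the prediction error.
        error = y - self.weight * x
        self.weight += lr * error * x


# Toy world: y = 3 * x. External memory holds one stored fact.
memory = {1.0: 3.0}

scaffolded = ToyModel()
recalled = scaffolded.predict(1.0, context=memory)        # answered from memory
unseen_scaffolded = scaffolded.predict(2.0, context=memory)  # weights never changed

trained = ToyModel()
for _ in range(20):
    trained.learn(1.0, 3.0)
unseen_trained = trained.predict(2.0)                     # answered from weights

print(recalled, unseen_scaffolded, unseen_trained)
```

The scaffolded model can repeat the stored fact but is no better at a new input, whereas the trained model has absorbed the relationship into its weights.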

Is predictive processing likely to be correct or useful?

My first impression is that predictive processing has the same flaw as the other theories of consciousness: testing it would require sufficiently advanced mech interp to probe model activations and determine whether the right internal representations exist. I believe there is some work on defining indicators for predictive processing, such as Kirchhoff et al. using KL divergences to measure prediction errors, but it still seems nascent.
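For intuition on how a KL divergence can act as a prediction-error measure, here is a minimal sketch that compares a predicted distribution with an observed one. This is just the textbook definition of KL divergence, not the specific method from Kirchhoff et al., and the example distributions are made up.

```python
import math


def kl_divergence(observed, predicted, eps=1e-12):
    """KL(observed || predicted) in nats for two discrete distributions."""
    return sum(
        p * math.log(p / max(q, eps))   # eps guards against division by zero
        for p, q in zip(observed, predicted)
        if p > 0                         # terms with p == 0 contribute nothing
    )


observed = [0.5, 0.5]          # what actually happened (made-up numbers)
good_prediction = [0.5, 0.5]   # a model that predicted it exactly
poor_prediction = [0.9, 0.1]   # a model that was confidently off

error_good = kl_divergence(observed, good_prediction)
error_poor = kl_divergence(observed, poor_prediction)
print(error_good, error_poor)
```

A perfect prediction gives zero divergence, and the divergence grows as the predicted distribution drifts away from what was observed.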

Happy to hear people’s thoughts on this and experiences with predictive processing!

^
I initially thought that this would be a case against Cyborgism, since a wrong prompt could accidentally create and destroy a conscious AI. However, this is not the case, because the base LLM does not learn or update from the prompts.