Predictive coding & depression

Epistemic status: wild speculation.

This is a follow-up to the 2017 blog post Slate Star Codex: Toward a Predictive Theory of Depression. I think that post has fundamentally the right idea, but some pieces of its puzzle didn’t quite fit together satisfyingly.

Well, I personally have no special knowledge about depression (happily!), but I have thought an awful lot about how predictive coding, motivation, and emotions interact in the brain. So I think that I can flesh out Scott’s basic story and make it more complete, coherent, and plausible. Here goes!

Let’s start with my cartoon of predictive coding as I see it:

For lots of details and discussion and caveats, see Predictive coding = RL + SL + Bayes + MPC.

In fact, don’t bother reading on here until you’ve read that.


… done? OK, let’s move on.

One more piece of background: In predictive coding, hypotheses have a “strength” term for the various predictions they make—basically, the confidence that this prediction will come true. If a strong prediction is erroneous, that’s likelier to rise to attention, and to cause the offending hypothesis to be discarded or modified. So let’s say you have the hypothesis:

Hypothesis: The thing I am looking at is a TV screen showing static.

This hypothesis...

  • ...makes weak predictions about visual inputs within the screen area (where there’s static),

  • ...makes strong predictions about visual inputs within the thin black frame of the TV,

  • ...makes infinitely weak (strength-zero) predictions of sensations coming from my foot.

The sensations from my foot are just not part of this hypothesis. Thus a separate hypothesis about my foot—say, that my foot is tapping the floor and will continue to do so—can be active simultaneously, and these two hypotheses won’t conflict.
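To make the "strength" idea concrete, here is a minimal toy sketch in Python. It is not a model of real neural machinery: the `Hypothesis` class, the channel names, the numeric values, and the rule "error = strength times mismatch" are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """Toy hypothesis: maps a channel to (predicted value, strength in [0, 1])."""
    name: str
    predictions: dict = field(default_factory=dict)

    def weighted_error(self, observations):
        """Sum of mismatches, each scaled by the strength of the prediction.

        Channels the hypothesis says nothing about count as strength zero,
        so they never contribute any error.
        """
        total = 0.0
        for channel, observed in observations.items():
            predicted, strength = self.predictions.get(channel, (0.0, 0.0))
            total += strength * abs(predicted - observed)
        return total


def overlap(a, b):
    """Channels on which both hypotheses make nonzero-strength predictions."""
    strong_a = {c for c, (_, s) in a.predictions.items() if s > 0}
    strong_b = {c for c, (_, s) in b.predictions.items() if s > 0}
    return strong_a & strong_b


# Hypothetical numbers: weak prediction inside the screen, strong one on the frame.
tv = Hypothesis("tv_static", {"screen": (0.5, 0.1), "frame": (0.0, 0.9)})
foot = Hypothesis("foot_tapping", {"foot": (1.0, 0.8)})

expected_scene = {"screen": 0.9, "frame": 0.0, "foot": 1.0}
surprising_scene = {"screen": 0.9, "frame": 1.0, "foot": 1.0}
```

In this sketch, a big mismatch inside the screen area barely registers for `tv`, while the same-sized mismatch on the frame produces a large weighted error. `overlap(tv, foot)` is empty, which is the toy version of "these two hypotheses won’t conflict".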

The proposed grand unified theory of depression in this post is the same as in Scott’s post: Severe depression is when all the brain’s hypotheses make anomalously weak predictions.

I’m now going to go through the symptoms of depression listed in Scott’s article, plus a couple others he left out:


… Depressed people describe the world as gray, washed-out, losing its contrast.

This one is just like Scott says. Normally we understand an apple via a hypothesis like “This thing is a bright red apple”. But in severe depression, the hypotheses are weaker: “This thing is probably an apple, and it’s probably reddish”.

Psychomotor retardation

Again, this one is just like Scott says. In predictive coding theory, top-down predictions about proprioceptive inputs are exactly the same signals as semi-low-level motor control commands. (I say “semi-low-level” because most motor commands go through an additional layer of brainstem and spinal cord control.)

If a hypothesis makes predictions about proprioceptive inputs with strength zero, then the muscles don’t move at all, as in the TV static + foot example above. If it makes predictions with the normal high strength, then the muscles move normally. Thus, somewhere in between, there’s a threshold where a hypothesis is making a very feeble prediction of proprioceptive inputs that can just barely and unreliably activate the downstream motor programs.
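Here is one way to picture that threshold, as a toy sketch only. The logistic form, the function name `activation_probability`, and the `threshold` and `noise` values are all assumptions chosen for illustration; nothing here is measured or claimed about real motor circuits.

```python
import math


def activation_probability(strength, threshold=0.3, noise=0.05):
    """Toy chance that a downstream motor program fires.

    Assumption: firing follows a logistic curve centered on `threshold`,
    with `noise` setting how wide the unreliable zone around it is.
    """
    return 1.0 / (1.0 + math.exp(-(strength - threshold) / noise))


# Strength zero: essentially never fires. Normal high strength: essentially always.
# Right at the threshold: fires about half the time, i.e. "barely and unreliably".
for strength in (0.0, 0.3, 1.0):
    print(strength, round(activation_probability(strength), 3))
```

The only point of the sketch is the shape: a reliable "off" region, a reliable "on" region, and a narrow band in between where the same feeble prediction sometimes triggers movement and sometimes doesn’t.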

(UPDATE June 2021: Shortly after writing this post, I stopped believing that proprioceptive predictions are exactly the same as motor commands. Instead I should have said “motor commands are the same type of neocortical signal as top-down predictions”. See here for where I changed my mind on that. But the point still stands: weaker top-down predictions mess up motor commands.)

Lack of motivation

Hypotheses make predictions about both sensory inputs (vision etc.) and internal inputs (satiety, pain, etc.); I don’t think there’s any mechanistic difference between those two types of predictions. (By the same token, even though I drew (b) and (c) as separate rows in the picture above, they’re actually implemented by the same basic mechanism.)

In depression, the hypotheses’ predictions about internal inputs have anomalously low strength, just like all their other predictions. Thus instead of a normal hypothesis like “I will eat and then I will feel full”, when depressed you get stuck with the weak hypothesis “I will eat and then I might feel full.”

Now look at process (e). I claim that process (e) does not change at all in depression: we still have the normal innate drives, and we still favor hypotheses which predict that those drives will be satisfied. But (e) votes much less strongly for the weak hypothesis (“I will eat and then I might feel full”) than for the normal hypothesis (“I will eat and then I will feel full”). Now, I think that we universally have a bias towards inaction—don’t take an action unless there’s a good reason to. So in depression, the bias towards inaction often wins out over the feeble vote from process (e), and thus we don’t bother to get up and eat.
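The argument above can be sketched as a tiny decision rule. This is a cartoon under stated assumptions: that the vote from process (e) is simply "how much the drive cares" multiplied by the prediction strength, that the inaction bias is a fixed number, and that the function and variable names (`decide`, `inaction_bias`) mean anything at all. None of that comes from the neuroscience; it just makes the logic explicit.

```python
def decide(plans, inaction_bias=0.5):
    """Pick the plan with the strongest vote, or do nothing.

    `plans` maps a plan name to (drive_value, prediction_strength).
    Assumption: vote = drive_value * prediction_strength, and a plan
    only wins if its vote beats the fixed bias towards inaction.
    """
    best_name, best_vote = "do nothing", inaction_bias
    for name, (drive_value, strength) in plans.items():
        vote = drive_value * strength
        if vote > best_vote:
            best_name, best_vote = name, vote
    return best_name


# Same innate drive (drive_value = 1.0) in both cases; only the strength differs.
normal = {"eat": (1.0, 0.9)}      # "I will eat and then I will feel full"
depressed = {"eat": (1.0, 0.3)}   # "I will eat and then I might feel full"
```

With these made-up numbers, `decide(normal)` returns `"eat"` and `decide(depressed)` returns `"do nothing"`, even though the drive itself is identical in both cases.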

(UPDATE June 2021: Having learned much more since writing this—see here—I guess this isn’t exactly wrong, but I would say it differently: I think there’s a dynamic where part of the brain (including dorsolateral prefrontal cortex and hippocampus) proposes a plan, and then another part of the brain (including medial prefrontal cortex and amygdala) evaluates the physiological implications of that plan. If the plan-proposer is proposing plans in an inaudible whisper, then the plan-evaluator can’t hear it, so to speak. See bottom section of Book review: Feeling Great)

Low self-confidence

All the predictive coding machinery operates basically the same way during (i) actual experience, (ii) memory recall, and (iii) imagination. Now, when we try to figure out, “Will I succeed at X?”, we imagine/​plan doing X, which entails searching through the space of hypotheses for one in which X successfully occurs at the end. In depression, none of the hypotheses will make a strong prediction that X will occur, because of course they don’t make strong predictions of anything at all. We interpret that as “I cannot see any way that I will succeed; thus I will fail”.

Why isn’t it symmetric? None of the hypotheses are making strong predictions of failure either, right? Well, I think that generically when people are planning out how to do something, they search through their hypothesis space for a hypothesis that has a specific indicator of success; they don’t search for a hypothesis that lacks a specific indicator of failure. I just don’t think you can run the neural hypothesis search algorithm in that opposite way.[1]

There’s another consideration that points in the same direction. Everyone always lets their current mood leak into their memories and expectations: when you’re happy, it’s tricky to remember being sad, etc. I think this is just because episodic memories and other hypotheses leave lots of gaps that we fill in with our current selves. Anyway, I hypothesize that this effect is even stronger when depressed: if you feel sad and anxious right now, and if none of your hypotheses contain a strong prediction of feeling happiness or any other emotion, well then there’s no emotion to be found except sadness and anxiety in your remembered past, or in your present, or in your imagined future. So, ask such a person whether they’ll succeed, and they might answer the question by just trying to imagine themselves feeling the joy of victory. They can’t. Ask whether they’ll fail, and they’ll try to imagine themselves feeling miserable. Well, they can do that! They are feeling miserable feelings right now, and those feelings can flood into an imagined future scenario. (See Availability heuristic.)

So, putting these two considerations together, a depressed person should have a very hard time imagining a specific course of events in which they accomplish anything in particular, and should have an equally hard time imagining themselves having proudly accomplished it. But they can easily imagine themselves not getting anything done, and continuing to feel the same misery that they feel right now. I think that’s a recipe for low self-confidence.

Feelings of sadness, worthlessness, self-hatred, etc.

(UPDATE June 2021: Please ignore this entire section, I was young and confused at the time. See bottom section of Book review: Feeling Great for a summary of what I think now instead.)

Everything about predictive coding happens in the cortex.[2] But when we talk about mood, we need to bring in the amygdala.[3] My hypothesis is that the amygdala is functioning normally (i.e., according to specification) in depression. The problem lies in the signals it receives from the cortex.

Let’s step back for a minute and talk about the interesting relationship between the cortex and amygdala. I discussed it a bit in Human instincts, symbol grounding, and the blank-slate neocortex. The amygdala is nominally responsible for emotions, yet figuring out what emotions to feel requires excruciatingly complex calculations that only the cortex is capable of. (“No, YOU were supposed to do the dishes, because remember I went shopping three times but you only vacuumed once and...”).

I think the cortex-amygdala relationship for emotions is somewhat analogous to the cortex-muscle relationship for motor control: The cortex sends signals to the amygdala, which are “predictions” from the cortex’s perspective and “input information” from the amygdala’s perspective. The amygdala uses that information, along with other sources of information (pain, satiety, etc.), to decide what emotions to emit. (See this comment for more.)

Back to depression. Again, outgoing signals from the cortex are synonymous with top-down predictions. So when all the top-down predictions get weaker, simultaneously the signals from the cortex to the amygdala get globally weaker.

Then what?

In principle, you could imagine two worlds. In one extreme world, the messages from the cortex to the amygdala only contain bad news—I’ve been insulted, that’s disgusting, I’m in trouble, etc. Then, when the cortical signals get globally weaker, the amygdala would flood us with joy. Everything is perfect!

In the opposite extreme world, the messages from the cortex to the amygdala only contain good news—I’m not being insulted right now, I am not disgusted right now, I have lots of friends, etc. Then, when the cortical signals get globally weaker, the amygdala would flood us with misery. Everything must be terrible!

In the real world, the cortex presumably sends both good-news messages and bad-news messages to the amygdala. But unfortunately for those suffering depression, it seems that all things considered, “no news is bad news”. Presumably the amygdala is especially expecting the cortex to send various good-news signals that say that our innate drives are being satisfied, that we have high status, that we’re expecting more good things to happen in the future, etc. When these signals get weaker, the amygdala responds by emitting all sorts of negative emotions.

Evolutionarily speaking, what is sadness for? In my view, sadness has an effect of causing us to abandon our plans when they’re not working out, and it also has a social effect of signaling to others that we are in a bad, unsustainable situation and need help. Well, the feeling that our innate drives are not being satisfied, and will not be satisfied in the foreseeable future … that sure seems like it ought to precipitate sadness if anything does, right?

Difficulty thinking and concentrating

Prediction strength and attention are closely related. If you want to attend to the taste of the thing you’re eating, your brain modifies the active hypothesis to have super high strength on the predictions of sensory inputs from your taste buds. Then any slight deviation of the actual sensory input from expectations will trigger a prediction error and bubble up to top-level attention.

When you string together a bunch of little hypotheses into an extended train of thought, attention has to be deftly steered around, to get the right information into the right places in working memory, and manipulate them the right way. It follows that if your brain can’t deploy hypotheses with sufficiently strong predictions, you will probably be unable to fully control your attention and think complex thoughts.


Beats me. I don’t understand sleep!


I’m pretty pleased that everything seems to fit together, without too much special pleading! But again, this is wild speculation, and I don’t have any special knowledge of depression. I’m happy to hear feedback.

Addendum: Speculating on what causes depression

(UPDATE June 2021: Please ignore this entire section, I was young and confused at the time. See bottom section of Book review: Feeling Great for a summary of what I think now instead.)

I think there has to be a vicious cycle involved in depression, otherwise it wouldn’t persist. Here’s my guess:

Vicious cycle: Globally weaker predictions cause sadness, and sadness causes globally weaker predictions.

I already talked about the first part. But why (evolutionarily) might sadness cause globally weaker predictions? Well, one evolutionary goal of sadness is to prompt us to abandon our current plans and strategies, even ones we’re deeply tied to, in the event that our prospects are grim and we can’t see any way to make things better. Globally weaker predictions would do that! As you weaken the prediction, “active plans” turn into “possible plans”, then into “vague musings”...

Anyway, maybe that vicious cycle dynamic is always present to some extent, but other processes push in other directions and keep our emotions stable. …Until a biochemical insult—or an unusually prolonged bout of “normal” sadness (e.g. from grief)—tips the system out of balance, and we get sucked into that vortex of mutually reinforcing “sadness + globally weak predictions”.

  1. This has to do with neural algorithm implementation details that I won’t get into. ↩︎

  2. As usual, “cortex” here is technically short for “predictive-world-model-building-system involving primarily the cortex, thalamus, and hippocampus.” ↩︎

  3. As in the previous footnote, don’t take the term “amygdala” too literally; I’m not sure how the functionality I’m talking about is divided up among the amygdala, hypothalamus, and/​or other brain structures. ↩︎