I’m curious what @Steven Byrnes has to say about the Kagan et al., Oct 2022 paper that @Roman Leventov mentioned.
My summary (I’m uncertain, as I find the paper a bit unclearly written) is that they:
Put a bunch of undifferentiated and unconnected human/mouse cortical neurons in a petri dish with electrodes connected to a computer running the game Pong.
Some of the electrodes delivered stimulation when the ball was at the corresponding relative position (place encoding).
Other electrodes recorded the culture’s electrical activity, and these signals were used to move the bar/paddle.
Whenever the paddle missed the ball, 4 seconds of uniform random noise was fed in instead of the place encoding.
And this caused the cells to learn to play Pong correctly, i.e. to not miss the ball.
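To make sure I’m parsing the protocol correctly, here is a minimal toy sketch of that closed loop as I’ve just described it. Everything in it (the CultureStub, the electrode count, the timing) is a placeholder I made up; it is not the authors’ actual multi-electrode setup, and the stub obviously doesn’t learn anything, it just shows the sense -> act -> (noise on miss) loop:

```python
import random

N_ELECTRODES = 8          # sensory electrodes used for the place encoding (made up)
FIELD_H = 8               # vertical positions the ball can occupy (made up)

class CultureStub:
    """Placeholder for the neuron culture: maps a stimulation pattern to a paddle move."""
    def respond(self, stimulation):
        # Real readouts would come from the motor-region electrodes; here we just
        # pick a move at random so the loop runs end to end.
        return random.choice([-1, 0, +1])

def place_encoding(ball_y):
    """One-hot stimulation pattern over the sensory electrodes (the 'place encoding')."""
    stim = [0.0] * N_ELECTRODES
    stim[ball_y * N_ELECTRODES // FIELD_H] = 1.0
    return stim

def noise_burst():
    """Stand-in for the unpredictable stimulation delivered after a miss."""
    return [random.random() for _ in range(N_ELECTRODES)]

culture = CultureStub()
paddle_y, ball_y, ball_dy = FIELD_H // 2, 0, 1
for step in range(1000):
    move = culture.respond(place_encoding(ball_y))        # sense -> act
    paddle_y = max(0, min(FIELD_H - 1, paddle_y + move))
    ball_y += ball_dy
    if ball_y in (0, FIELD_H - 1):                        # bounce off the walls
        ball_dy = -ball_dy
    if step % 20 == 19:                                   # ball reaches the paddle's side
        if abs(ball_y - paddle_y) > 1:                    # miss
            for _ in range(4):                            # "4 seconds" of noise feedback
                culture.respond(noise_burst())
```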
Isn’t this concrete evidence in favor of active inference? It seems like evidence that the cortex is doing active inference at a very micro level, because the neural circuit in the petri dish is not merely making predictions about what sensory observations it will see, but actually taking actions to minimize its later prediction error. We could look at the implementation details of how this happens, and it might turn out to work via a feedback control system, but I don’t know those details. My best guess for how to model this behaviour would be, essentially, that the whole neural net is optimizing itself so as to minimize its average prediction error over time, with prediction error being an actual hardcoded variable in the network (I don’t know what physically encodes it; maybe the concentration of some protein inside the soma, or perhaps just the firing rate. I don’t know enough about neuroscience).
I’m not sure about any of this. But it really does seem like the neural network ends up taking actions to minimize its later prediction error, without any kind of RL. It basically seems like the outputs of the neurons are all jointly optimized to minimize average prediction error over time within the Pong environment. And that is exactly what active inference claims, as far as I understand it (but I haven’t studied it much). And to be clear, afaict it is completely possible that (in the brain, not in this petri dish) there is RL happening on top of this active inference system, so this doesn’t mean predictive processing + active inference (+ maybe the free energy principle, idk) is a unified theory of the whole brain, but maybe it’s still a correct theory of the (neo-)cortex?
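To make concrete what I mean by “jointly optimized to minimize average prediction error over time”, here is one crude way to cash that out as a toy (my own made-up sketch, not anything from the paper or from the active inference literature): an agent with a hardwired expected observation, a learned forward model, action selection that minimizes expected future prediction error under that model, and learning that does gradient descent on the model’s own prediction error. All names and numbers below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hardwired "expected observation" (the agent predicts it will observe 0);
# acting so that this prediction comes true is the "active" part.
PRIOR = 0.0

# Learned forward model: o_next ≈ o + d_est + w_est * a
d_est, w_est = 0.0, 0.1
true_d, true_w = 0.3, 1.5           # real dynamics, unknown to the agent
o = 2.0                             # initial observation
actions = np.linspace(-1.0, 1.0, 41)
lr = 0.05

for t in range(300):
    # ACTION: pick the action whose predicted next observation best matches the
    # prior, i.e. minimizes the expected future prediction error.
    predicted = o + d_est + w_est * actions
    a = actions[np.argmin((predicted - PRIOR) ** 2)]
    o_hat = o + d_est + w_est * a

    # WORLD: the environment responds; the agent only sees the new observation.
    o = o + true_d + true_w * a + 0.05 * rng.standard_normal()

    # LEARNING ("perception"): update the forward model by gradient descent on
    # its own squared prediction error, so predictions track the world.
    err = o - o_hat
    d_est += lr * err
    w_est += lr * err * a

print(f"final observation {o:+.2f} (prior {PRIOR}), learned w_est {w_est:+.2f}")
```

The point of the toy is just that the same quantity (squared prediction error) drives both the action choice and the weight updates; whether real neurons do anything like this is exactly what’s in question.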
(Of course that doesn’t mean that the free energy principle is the right tool, but it might be the right tool for an active inference system even though it’s overkill for a thermostat. This is not my main question though).
My current belief is that the neurons-in-a-dish did not actually learn to play Pong, but rather that the authors made it kinda look that way by p-hacking and cherry-picking. You can see me complaining here, tailcalled here, and Seth Herd also independently mentioned to me that he looked into it once and wound up skeptical. Some group wrote up a more effortful and official-looking criticism paper here.
Oh, I didn’t expect you to deny the evidence, interesting. Before I look into it more to try to verify/falsify (which I may or may not do), suppose it turns out that this general method does in fact work, i.e. it learns to play Pong, or at least that in some other experiment something learns using this exact mechanism. Would that be a crux? I.e., would that make you significantly update towards active inference being a useful and correct theory of the (neo-)cortex?
EDIT: the paper in your last link seems to be a purely semantic criticism of the paper’s usage of words like “sentience” and “intelligence”. They do not provide any analysis at all of the actual experiment performed.
No. …I’m just gonna start ranting, sorry for any mischaracterizations…
For one thing, I think the whole experimental concept is terrible. I think that a learning algorithm is a complex and exquisitely-designed machine. While the brain doesn’t do backprop, backprop is still a good example of how “updating a trained model to work better than before” takes a lot more than a big soup of neurons with Hebbian learning. Backprop requires systematically doing a lot of specific calculations and passing the results around in specific ways and so on.
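(To make that concrete, here is a tiny numpy toy of my own, with no connection to the dish experiments: even for a two-layer net, the update step needs the forward activations kept around and the error signals routed backwards along specific paths, whereas a bare Hebbian rule is a single local line with no notion of a target or of credit assignment.)

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 3))                        # toy inputs
Y = (X.sum(axis=1, keepdims=True) > 0).astype(float)    # toy targets

W1 = rng.standard_normal((3, 8)) * 0.1
W2 = rng.standard_normal((8, 1)) * 0.1
lr = 0.5

for step in range(200):
    # Forward pass: every intermediate result has to be cached for later use.
    h = np.tanh(X @ W1)
    y_hat = 1.0 / (1.0 + np.exp(-(h @ W2)))

    # Backward pass: the error is routed backwards along specific pathways
    # (through W2, then through the tanh derivative) to assign credit to each weight.
    d_out = (y_hat - Y) / len(X)            # dLoss/d(pre-sigmoid), cross-entropy loss
    grad_W2 = h.T @ d_out
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)   # chain rule through tanh
    grad_W1 = X.T @ d_h
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2

# By contrast, a bare Hebbian update is purely local and has no target at all:
#     W1 += lr * (X.T @ h)    # "neurons that fire together wire together"
# Nothing in that line says which direction of change improves task performance.
```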
So you look into the cortex, and you can see the layers and the minicolumns and the cortex-thalamus-cortex connections and so on. It seems really obvious to me that there’s a complex genetically-designed machine here. My claim is that it’s a machine that implements a learning algorithm and queries the trained model. So obviously (from my perspective), there are going to be lots of synapses that are written into the genetic blueprint of this learning-and-querying machine, and lots of other synapses that are part of the trained model that this machine is editing and querying.
In ML, it’s really obvious which bits of information and computation are part of the human-created learning algorithm and which bits are part of the trained model, because we wrote the algorithm ourselves. But in the brain, everything is just neurons and synapses, and it’s not obvious what’s what.
Anyway, treating neurons-in-a-dish as evidence for how the brain works at an algorithmic level is like taking a car, totally disassembling it, and putting all the bolts and wheels and sheet-metal etc. into a big dumpster, and shaking it around, and seeing if it can drive. Hey, one of the wheels is rolling around a bit, let’s publish. :-P
(If you’re putting neurons in a dish in order to study some low-level biochemical thing like synaptic vesicles, fine. Likewise, you can legitimately learn about the shape and strength of nuts and bolts by studying a totally-disassembled-car-in-a-dumpster. But you won’t get to see anything like a working car engine!)
The brain has thousands of neuron types. Perhaps it’s true that if you put one type of neuron in a dish then it does (mediocre) reinforcement learning, where a uniform-random 150mV 5Hz stimulation is treated by the neurons as negative reward, and where a nonrandom 75mV 100Hz stimulation is treated by the neurons as positive reward. I don’t think it’s true, but suppose that were the case. Then my take would be: “OK cool, whatever.” If that were true, I would strongly guess that the reason that the two stimulation types had different effects was their different waveforms, which somehow interact with neuron electrophysiology, as opposed to the fact that one is more “predictable” than the other. And if it turned out that I’m wrong about that (i.e., if many experiments showed that “unpredictability” is really load-bearing), then I would guess that it’s some incidental result that doesn’t generalize to every other neuron type in the brain. And if that guess turned out wrong too, I still wouldn’t care, for the same reason that car engine behavior is quite different when it’s properly assembled versus when all its parts are disconnected in a big pile.
Even putting all that aside, the idea that the brain takes actions to minimize prediction error is transparently false. Just think about everyday life: Sometimes it’s unpleasant to feel confused. But other times it’s delightful to feel confused!—we feel mesmerized and delighted by toys that behave in unintuitive ways, or by stage magic. We seek those things out. Not to mention the Dark Room Problem. …And then the FEP people start talking about how the Dark Room Problem is not actually a problem because “surprise” actually means something different and more complicated than “failing to predict what’s about to happen”, blah blah blah. But as soon as you start adding those elaborations, suddenly the Pong experiment is not supporting the theory anymore! Like, the Pong experiment is supposed to prove that neurons reconfigure to avoid impossible-to-predict stimuli, as a very low-level mechanism. Well, if that’s true, then you can’t turn around and redefine “surprise” to include homeostatic errors.
Ah, true, thanks.
So I think all of this sounds mostly reasonable (and is probably based on a bunch of implicit world-model about the brain that I don’t have); the longest paragraph especially makes me update.
I think whether I agree with this view depends heavily on how well, quantitatively, these brain-in-a-dish systems perform, which I don’t know, so I’ll look into it more first.