(Posting here rather than SSC because I wrote the whole comment in markdown before remembering that SSC doesn’t support it.)
We had a guest lecture from Friston last year and I cornered him afterwards to try to get some enlightenment (notes here). I also spent the next few days working through the literature, using a multi-armed bandit as a concrete problem (notes here).
Very few of the papers have concrete examples. Those that do often skip important parts of the math and use inconsistent/ambiguous notation. He doesn’t seem to have released any of the code for his game-playing examples.
The various papers don’t all even implement the same model—the free energy principle seems to be more a design principle than a specific model.
The Wikipedia page doesn’t explain much but at least uses consistent and reasonable notation.
The paper “Reinforcement Learning or Active Inference?” has most of a worked model, and is the closest I’ve found to explaining how utility functions get encoded into meta-priors. It also contains:
When friends and colleagues first come across this conclusion, they invariably respond with; “but that means I should just close my eyes or head for a dark room and stay there”. In one sense this is absolutely right; and is a nice description of going to bed. However, this can only be sustained for a limited amount of time, because the world does not support, in the language of dynamical systems, stable fixed-point attractors. At some point you will experience surprising states (e.g., dehydration or hypoglycaemia). More formally, itinerant dynamics in the environment preclude simple solutions to avoiding surprise; the best one can do is to minimise surprise in the face of stochastic and chaotic sensory perturbations. In short, a necessary condition for an agent to exist is that it adopts a policy that minimizes surprise.
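For reference, the quantities behind "minimizes surprise" in that passage can be written in the standard textbook form below. The notation is an assumption and may not match the paper's: $o$ is the observations, $s$ the hidden states, $p$ the agent's generative model and $q$ its approximate posterior.

$$
\text{surprise}(o) = -\ln p(o)
$$

$$
F[q, o] = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] = D_{\mathrm{KL}}\big[q(s) \,\|\, p(s \mid o)\big] - \ln p(o) \;\ge\; -\ln p(o)
$$

Since the KL term is non-negative, the free energy $F$ is an upper bound on surprise, which is why minimising $F$ is treated as a tractable stand-in for minimising surprise directly.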
I am leaning towards ‘the emperor has no clothes’. In support of this:
Friston doesn’t explain things well, but nobody else seems to have produced an accessible worked example either, even though many people claim to understand the theory and think that it is important.
Nobody seems to have used this to solve any novel problems, or even to solve well-understood trivial problems.
I can’t find any good mappings/comparisons to existing models. Are there priors that cannot be represented as utility functions, or vice versa? What explore/exploit tradeoffs do free-energy models lead to, or can they encode any given tradeoff?
At this point I’m unwilling to invest any further effort into the area, but I could be re-interested if someone were to produce a Python notebook or similar with a working solution for some standard problem (eg multi-armed bandit).
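To pin down what "standard problem" means here, this is a minimal sketch of a Bernoulli multi-armed bandit solved with Thompson sampling. To be clear, this is a conventional Bayesian baseline and not a free-energy or active-inference model; the function name `run_thompson`, the arm probabilities and the step count are all made up for illustration. It is only meant to show the kind of baseline a free-energy solution could be compared against.

```python
import random


def run_thompson(true_probs, n_steps=2000, seed=0):
    """Thompson sampling on a Bernoulli bandit.

    true_probs: hidden payout probability of each arm (illustrative values).
    Returns (total_reward, pulls_per_arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm's payout rate; wins/losses are the
    # Beta parameters and get incremented as rewards are observed.
    wins = [1] * n_arms
    losses = [1] * n_arms
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(n_steps):
        # Draw one plausible payout rate per arm from its posterior,
        # then pull the arm whose draw is highest.
        samples = [rng.betavariate(wins[i], losses[i]) for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update of the chosen arm's posterior.
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return total_reward, pulls


if __name__ == "__main__":
    reward, pulls = run_thompson([0.2, 0.5, 0.8])
    print("total reward:", reward)
    print("pulls per arm:", pulls)
```

The explore/exploit tradeoff here falls out of the posterior sampling: uncertain arms occasionally produce high draws and get tried, and that happens less often as evidence accumulates.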
I didn’t see the post itself, but it sounds like Unconscious Thought Theory. The experimental evidence is pretty weak, and imo the theory as it stands is just too poorly specified to really test experimentally.
There is some evidence that offline processing matters for eg motor learning or statistical learning. I haven’t looked in enough detail to know whether to trust it or not.