This appears to have aged very poorly in light of the remarkable progress now being made with RLHF in LLMs, which keep achieving increasingly complex, longer-horizon task performance… lol.
I also think you wrongly dismiss many of the points made by RL researchers. It seems very plausible, and is even widely taken for granted, that attentional and behavioral control in humans essentially comes down to basic reinforcement learning signals (dopamine signaling is essential to voluntary control of motor output), with a great deal of complex system architecture built around them by millions of years of natural selection. Wireheading studies in humans and other animals strongly suggest this. Dopaminergic activity and circuitry have been manipulated experimentally in many different ways, even just by observing the effects of well-known psychoactive drugs, and the framework of reinforcement learning, reward prediction errors, and action selection consistently provides a strong predictive model.
It is interesting to consider, though, whether reinforcement learning as we formalize it mathematically and use it in machine learning misses or obscures related functionalities in the brain, such as emotional valence, and whether we thereby fail to see important architectural elements of brains that our AI models don't encapsulate, as in the "wanting" vs. "liking" distinction.
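The reward-prediction-error framework invoked above can be sketched in a few lines. This is a minimal toy temporal-difference model, the standard computational account of phasic dopamine signaling; the states, learning rate, discount factor, and reward schedule are all illustrative assumptions, not values from any particular study:

```python
# Toy TD(0) model: delta plays the role of the dopamine reward
# prediction error. Parameters (alpha, gamma, rewards) are
# illustrative assumptions only.

def td_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """One TD(0) update; returns the prediction error delta."""
    delta = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * delta
    return delta

# A two-state world: a cue state reliably precedes a rewarded state.
values = {"cue": 0.0, "reward_state": 0.0, "terminal": 0.0}

for trial in range(200):
    # Early in training, delta is large at reward delivery; as the
    # cue's value is learned, the error migrates back to the cue,
    # mirroring the classic shift in dopamine responses.
    td_update(values, "cue", "reward_state", reward=0.0)
    td_update(values, "reward_state", "terminal", reward=1.0)

# After training, the cue itself carries predicted value (~gamma * 1.0).
print(round(values["cue"], 2), round(values["reward_state"], 2))
```

After enough trials the learned values converge (the rewarded state toward 1.0, the cue toward gamma times that), which is the sense in which the framework "predicts" the migration of dopamine responses to reward-predicting cues.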