dmac_93 comments on Reward Function Design: a starter pack

dmac_93 6 Feb 2026 19:31 UTC
2 points
0
My theory is that the brain uses both reinforcement learning and closed-loop control, and that it uses the closed-loop controller’s error to generate reward signals endogenously.
That is to say: a reward is given when the closed-loop controller reaches its setpoint, and a penalty is given if it moves too far from its setpoint.
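To make that concrete, here is a rough sketch of what I have in mind, not a claim about how the brain actually implements it. Everything in it is an assumption for illustration: the names `endogenous_reward`, `reach_tol`, and `fail_tol`, the thresholds, the ±1 reward values, and the simple proportional controller standing in for whatever the real closed-loop controller is.

```python
def endogenous_reward(error, reach_tol=0.05, fail_tol=1.0):
    """Turn a closed-loop controller's error into a scalar reward.

    reach_tol and fail_tol are hypothetical thresholds chosen for illustration.
    """
    magnitude = abs(error)
    if magnitude <= reach_tol:
        return 1.0   # controller is at its setpoint: reward
    if magnitude >= fail_tol:
        return -1.0  # controller is too far from its setpoint: penalty
    return 0.0       # in between: no signal


def simulate(setpoint=37.0, start=35.0, gain=0.5, steps=20):
    """Run a proportional controller and record the reward at each step."""
    state = start
    rewards = []
    for _ in range(steps):
        error = setpoint - state           # closed-loop error signal
        rewards.append(endogenous_reward(error))
        state += gain * error              # proportional correction
    return state, rewards


if __name__ == "__main__":
    final_state, rewards = simulate()
    print(final_state, rewards)
```

In this toy run the early steps are penalized because the state starts far from the setpoint, and the later steps are rewarded once the error drops inside `reach_tol`. An RL learner would then be trained on that reward stream instead of on an externally supplied one.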
- Steven Byrnes 7 Feb 2026 16:29 UTC
  4 points
  0
  Yeah I agree with that. I have a diagram of homeostatic feedback control in §1.5 of my SMTM reply post, and RL is one of the ingredients (d & f).
  - dmac_93 11 Feb 2026 15:39 UTC
    1 point
    0
    If I could pull a nugget of truth out of SMTM’s work, it would be that the brain is a control system. There are many different types of control system, and the brain probably uses all of them. For example, the spinal cord alone contains closed-loop controllers (for controlling muscle forces and positions), open-loop controllers (for pain withdrawal reflexes), and finite state machines (for walking and running on four legs).
    The question is, how does the brain use RL to implement a control system? And how does that interface with the other control systems in the brain?