Something like this is argued to be why humans are frankly exceptionally well aligned to basic homeostatic drives, and the only real failure modes that happened are basically obesity, drugs and maybe alcohol as things that misaligned us with basic needs, as hedonic treadmills/loops essentially tame the RL part of us, and make sure that reward isn’t the optimization target in practice, like TurnTrout’s post below:
Something like this is argued to be why humans are frankly exceptionally well aligned to basic homeostatic drives, and the only real failure modes that happened are basically obesity, drugs and maybe alcohol as things that misaligned us with basic needs, as hedonic treadmills/loops essentially tame the RL part of us, and make sure that reward isn’t the optimization target in practice, like TurnTrout’s post below:
https://www.lesswrong.com/posts/pdaGN6pQyQarFHXF4/reward-is-not-the-optimization-target
Similarly, 2 beren posts below explain how the PID control loop may be helpful for alignment:
https://www.lesswrong.com/posts/3mwfyLpnYqhqvprbb/hedonic-loops-and-taming-rl
https://www.beren.io/2022-11-29-Preventing-Goodheart-with-homeostatic-rewards/