Gunnar_Zarncke comments on When is Goodhart catastrophic?

Gunnar_Zarncke 9 May 2023 22:19 UTC
6 points
−10
I wonder if the brainstem is limiting optimization in some way like this. So far my assumption was that the brainstem uses some saturation and temporal decay on the multiple reward components to prevent Goodharting. But maybe it is doing something closer to the t-limiting described here.
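
To make the saturation-plus-temporal-decay assumption concrete, here is a toy Python sketch. It is not a model of the brainstem: `saturate`, `RewardComponent`, `cap`, and `decay` are hypothetical names picked for illustration, and tanh squashing with exponential habituation is just one arbitrary choice of mechanism.

```python
import math


def saturate(raw, cap=1.0):
    """Squash a raw signal so the result never exceeds `cap` (assumed tanh shape)."""
    return cap * math.tanh(raw / cap)


class RewardComponent:
    """One reward channel with saturation and habituation (illustrative only)."""

    def __init__(self, cap=1.0, decay=0.5):
        self.cap = cap
        self.decay = decay
        self.habituation = 0.0  # 0 = fresh, approaches 1 under constant stimulation

    def step(self, raw):
        # Saturation bounds the payoff; habituation discounts repeated stimulation.
        reward = saturate(raw, self.cap) * (1.0 - self.habituation)
        stimulated = 1.0 if raw > 0 else 0.0
        # Habituation rises while stimulated and decays back toward 0 otherwise.
        self.habituation = self.decay * self.habituation + (1.0 - self.decay) * stimulated
        return reward


component = RewardComponent()
repeated = [component.step(10.0) for _ in range(3)]  # same strong stimulus three times
component.step(0.0)  # rest
component.step(0.0)  # rest
recovered = component.step(10.0)
print(repeated, recovered)
```

In this sketch, pushing one component harder buys almost nothing once it saturates, and repeating the same stimulus pays less each time until the channel has rested, so an optimizer gains little by driving any single proxy to extremes.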