faul_sname comments on faul_sname’s Shortform

faul_sname 4 Mar 2025 11:23 UTC
3 points
Where does the gradient that chisels in the “care about the long-term X over satisfying the homeostatic drives” behavior come from, if not from cases where caring about the long-term X previously resulted in attributable reward? If that behavior is only relevant in rare cases, I expect the gradient to be pretty weak, and correspondingly I don’t expect the behavior it chisels in to be very sophisticated.
- Gurkenglas 4 Mar 2025 12:12 UTC
  3 points
  https://www.lesswrong.com/posts/roA83jDvq7F2epnHK/better-priors-as-a-safety-problem