Maybe one concrete implementation would be: when doing RL[1] on an AI like o3, don’t give it a single math question to solve. Instead, give it something like 5 quite different tasks, and make the AI allocate its time across all 5.
I know this sounds like a small boring idea, but it might actually help if you really think about it! It might cause the resulting agent’s default behaviour pattern to be “optimize multiple tasks at once” rather than “optimize a single task ignoring everything else.” It might be the key piece of RL behind the behaviour of “whoa I already optimized this goal very thoroughly, it’s time I start caring about something else,” and this might actually be the behaviour that saves humanity.
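To make the idea a bit more concrete, here is a minimal sketch (purely hypothetical, not any lab’s actual training code) of what the reward for one such multi-task episode could look like. The function names, task names, and the diminishing-returns model are all my own illustrative assumptions; the only point is that with concave per-task progress, the reward-maximizing policy spreads its budget across tasks instead of dumping everything into one.

```python
# Hypothetical sketch: reward for a multi-task RL episode in which the agent
# splits a fixed time/compute budget across several tasks. Diminishing returns
# per task mean the best policy spreads effort rather than maxing out one task.
# All names and numbers are illustrative assumptions, not a real training setup.
import math
from typing import Dict

def episode_reward(budget_allocation: Dict[str, float],
                   total_budget: float = 1.0) -> float:
    """Reward for one multi-task episode.

    budget_allocation maps task ids to the fraction of the budget the agent
    chose to spend on each task. Per-task progress is modeled as log-like
    (concave), so going all-in on one task scores worse than a sensible spread.
    """
    spent = sum(budget_allocation.values())
    if spent > total_budget + 1e-9:
        return -1.0  # overspent the budget: penalize

    # Concave per-task progress: each extra unit of effort helps less.
    return sum(math.log1p(10.0 * share) for share in budget_allocation.values())

# Example: an even split across 5 different tasks beats going all-in on one.
tasks = ["math_proof", "code_bug", "translation", "planning", "summarize"]
even = {t: 0.2 for t in tasks}
all_in = {t: (1.0 if t == "math_proof" else 0.0) for t in tasks}
print(episode_reward(even) > episode_reward(all_in))  # True
```

Under this kind of reward, “stop grinding on the task you’ve already optimized and go care about the others” is simply the optimal move, which is the behaviour pattern the training is meant to instill.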
[1] RL = reinforcement learning