Steven Byrnes comments on Evolution Solved Alignment (what sharp left turn?)

Steven Byrnes 16 Oct 2023 22:08 UTC
4 points
I don’t think that, in order for an algorithm to be RL, its reward function must by definition be a proxy for something else more complicated. For example, the RL reward function of AlphaZero is not an approximation of a more complex thing—the reward function is just “did you win the game or not?”, and winning the game is a complete and perfect description of what DeepMind programmers wanted the algorithm to do. And everyone agrees that AlphaZero is an RL algorithm, indeed a central example. Anyway, AlphaZero would be an RL algorithm regardless of the motivations of the DeepMind programmers, right?
- jacob_cannell 16 Oct 2023 22:49 UTC
  2 points
  True, in games the environment itself can directly provide a reward channel, such that the perfect ‘proxy’ simplifies to the trivial identity connection on that channel. But that’s hardly an interesting case, right? A human ultimately designed the reward channel for that engineered environment, often as a proxy for some human concept.
  
  The types of games/sims that are actually interesting for AGI, or even just for general robots or self-driving cars, are open-ended ones, where designing the correct reward function (as a proxy for true utility) is much of the challenge.