But reasoning models don’t get reward during deployment. In what sense are they “optimizing for reward”?
See the discussion with Violet Hour elsethread.