Daniel Kokotajlo comments on The behavioral selection model for predicting AI motivations

Daniel Kokotajlo 4 Dec 2025 22:19 UTC
LW: 33 AF: 15
This is really great, thank you! It feels like a one-stop shop for a lot of the most important ideas and arguments that have been developed on the topic of deep learning misalignment over the past few years.
- Daniel Kokotajlo 19 Mar 2026 18:34 UTC
  LW: 4 AF: 4
  Possibly relevant empirical evidence has arrived!
  Also this one here of CoT samples!