Promoted to curated: I do think this post summarizes one of the basic intuition generators for predicting AI motivations. It's missing a lot of important stuff, especially as AIs get more competent (in particular, it doesn't cover reflection, which I expect to be among the primary dynamics shaping the motivations of powerful AIs), but it's still quite helpful to have it written up in more explicit form.
Is this close to what you mean by reflection? … once a system can represent its own objective formation, selection on behavior becomes selection on the process that builds behavior. Have you seen a way to formalize this? And can you differentiate it from the problems Gödel and Turing discussed? A toy sketch of what I have in mind is below. Thanks, -RS
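To make the question concrete, here is a minimal toy sketch of the distinction I mean. Everything in it (the function names, the quadratic objective, the hill-climbing loop) is illustrative and my own construction, not anything from the post: in the first case selection acts directly on an action parameter, while in the second the system has a representation of its own objective formation (a `target` parameter that builds its internal objective), so the same outer selection pressure now acts on the builder rather than on behavior itself.

```python
# Toy sketch (hypothetical names, not from the post): selection on behavior
# vs. selection on the process that builds the objective behind behavior.
import random

random.seed(0)

def environment_score(action: float) -> float:
    # The outer selection pressure: rewards actions near 3.0.
    return -(action - 3.0) ** 2

def select_on_behavior(steps: int = 200) -> float:
    # Level 1: hill-climb directly on the action parameter.
    action = 0.0
    for _ in range(steps):
        candidate = action + random.gauss(0, 0.5)
        if environment_score(candidate) > environment_score(action):
            action = candidate
    return action

def behavior_from_objective(target: float) -> float:
    # Inner optimization is trivial here: the best action under the
    # internal objective -(a - target)^2 is the target itself.
    return target

def select_on_objective_builder(steps: int = 200) -> float:
    # Level 2: the system represents its own objective formation, so the
    # same outer selection now hill-climbs on `target`, i.e. on the
    # process that builds behavior, not on behavior directly.
    target = 0.0
    for _ in range(steps):
        candidate = target + random.gauss(0, 0.5)
        old = environment_score(behavior_from_objective(target))
        new = environment_score(behavior_from_objective(candidate))
        if new > old:
            target = candidate
    return target

print(select_on_behavior())           # converges near 3.0
print(select_on_objective_builder())  # also near 3.0, but via the builder
```

In this toy the two levels converge to the same behavior, which is exactly why the distinction is easy to miss: the difference is in what the selection gradient is defined over, not in the end point.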