By this standard, does all self-play count as “recursive self-improvement”? Or only self-play where the two players’ roles are different, or not entirely rivalrous (like author and critic) rather than the same and purely rivalrous (like white and black in Go)?
Self-play is the foundational element of recursive self-improvement: it is what let AI systems bootstrap to superhuman performance in chess and Go by training against copies of themselves over as many iterations as compute allowed. That said, there is likely a meaningful spectrum within self-play, depending on how independent and how algorithmically sophisticated the evaluating role is.
Different forms of self-play sit at different points on that spectrum. Purely rivalrous, symmetric self-play (like white versus black in Go) represents one end, while asymmetric arrangements with structurally distinct roles (like author and critic) may occupy a qualitatively different position, particularly when the critic brings independent evaluative criteria rather than simply optimizing the same objective from the opposing side. How far the feedback signal is decoupled from the generative process likely matters a great deal for how well the label “recursive” fits the improvement.
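To make the distinction concrete, here is a toy sketch, not a real training setup. It assumes a "policy" is just a number in [0, 1], and every name in it (`improve`, `beats_own_copy`, `critic_prefers`, `CRITIC_TARGET`) is hypothetical. The same hill-climbing loop runs twice; only the source of the feedback signal changes.

```python
import random
from typing import Callable


def improve(current: float,
            accept: Callable[[float, float], bool],
            steps: int = 200,
            seed: int = 0) -> float:
    """Hill-climb: propose a small mutation, keep it only if `accept` approves."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Mutate the current policy slightly, clamped to [0, 1].
        candidate = min(1.0, max(0.0, current + rng.uniform(-0.05, 0.05)))
        if accept(candidate, current):
            current = candidate
    return current


# Symmetric, purely rivalrous: the only feedback is whether the candidate
# beats a copy of the current policy at the same game.
# Toy assumption: the higher number wins.
def beats_own_copy(candidate: float, current: float) -> bool:
    return candidate > current


# Asymmetric: a critic scores drafts against its own criterion, which the
# author never optimizes directly. The target value is an arbitrary assumption.
CRITIC_TARGET = 0.7


def critic_score(draft: float) -> float:
    return -abs(draft - CRITIC_TARGET)


def critic_prefers(candidate: float, current: float) -> bool:
    return critic_score(candidate) > critic_score(current)


rival = improve(0.2, beats_own_copy)     # feedback defined relative to itself
authored = improve(0.2, critic_prefers)  # feedback defined by a separate evaluator
```

In the first run the acceptance test is defined entirely by the policy's own current state, so the feedback is maximally coupled to the generator. In the second, the critic's criterion is fixed independently of the author, which is the kind of decoupling the answer above points to.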