I think open-loop vs closed-loop is a dimension orthogonal to RSI vs not-RSI.
open-loop I-RSI: “AI plays a game and thinks about which reasoning heuristics resulted in high reward, and then decides to use those heuristics more readily”
open-loop E-RSI: “AI implements different agent scaffolds for itself, and benchmarks them using FrontierMath”
open-loop not-RSI: “humans implement different agent scaffolds for the AI, and benchmark them using FrontierMath”
closed-loop I-RSI: “AI plays a game and thinks about which reasoning heuristics could’ve shortcut the decision”
closed-loop E-RSI: “AI implements different agent scaffolds for itself, benchmarks them using FrontierMath, and feeds the results back into designing the next round of scaffolds” (see the sketch after this list)
closed-loop not-RSI: Constitutional AI, AlphaGo
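For concreteness, here is a minimal Python sketch of the two E-RSI rows, under the reading that “closed-loop” means benchmark results feed back into the next round of proposals while “open-loop” means they don't. Everything in it is a hypothetical stand-in: `Scaffold`, `propose_scaffolds`, and `score_on_benchmark` are invented names, and FrontierMath has no scripting API like this.

```python
# Hypothetical sketch; none of these names correspond to real APIs.
import random
from dataclasses import dataclass


@dataclass
class Scaffold:
    """A candidate agent scaffold, e.g. a prompt template plus a tool budget."""
    prompt_template: str
    max_tool_calls: int


def propose_scaffolds(n: int, seed: Scaffold | None = None) -> list[Scaffold]:
    """Stand-in for the AI writing n candidate scaffolds for itself.

    If a seed is given, proposals are (notionally) variations on it.
    """
    base = seed.prompt_template if seed else "strategy"
    return [
        Scaffold(prompt_template=f"{base}-v{i}", max_tool_calls=random.randint(1, 10))
        for i in range(n)
    ]


def score_on_benchmark(scaffold: Scaffold) -> float:
    """Stand-in for running the scaffolded agent on a FrontierMath-style suite."""
    return random.random()  # placeholder for fraction of problems solved


def open_loop_e_rsi(n_candidates: int = 8) -> Scaffold:
    # Open-loop: propose once, benchmark once, adopt the best.
    # The scores never influence what gets proposed.
    return max(propose_scaffolds(n_candidates), key=score_on_benchmark)


def closed_loop_e_rsi(rounds: int = 3, n_candidates: int = 8) -> Scaffold:
    # Closed-loop: each round's winner seeds the next round's proposals,
    # so benchmark results feed back into scaffold design.
    best = None
    for _ in range(rounds):
        candidates = propose_scaffolds(n_candidates, seed=best)
        best = max(candidates, key=score_on_benchmark)
    return best
```

Under this sketch the only structural difference between the two rows is whether scores flow back into proposal, and swapping the AI for humans as the author of `propose_scaffolds` is what moves a row from E-RSI to not-RSI.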