Yeah. Though strictly speaking, only something like self-play (mentioned by Karl Krueger below) is a model improving itself. The more classic example of RSI is a model acting as an AI researcher, working on better ML algorithms (optimizer, architecture, objective function, etc.), which doesn’t directly improve the model (the AI ML researcher) itself, only successor models, which are then trained using the improved algorithm. The latter form of RSI is nonetheless more powerful than something like self-play, which is stuck with its likely suboptimal architecture and will eventually plateau. An automated ML researcher can in principle keep improving up to technological maturity: the best ML algorithm possible.