Good analysis; this seems very likely to me as well. It seems worth thinking about the strategic reasoning behind this. I hadn’t previously considered in depth the dynamic of each new pretrain release being just slightly better than the previous pretrain + X RLVR; this does seem to imply that OpenAI is intentionally avoiding large jumps in capability with its iterative deployment strategy. Why is this important to them? The first idea I had, which seems somewhat likely, is that large capability jumps generate mainstream press and discourse (see mythos). This strategy would dodge that press, which, if true, seems like an important signal about their priorities regarding messaging and narrative (conjecture: is it to their benefit for people to believe AI has “hit a wall”? Maybe they want to avoid the specific kind of “doomer” rhetoric that seems to arise whenever large capability jumps happen?). Looking into how the various labs are attempting to shape the narrative seems important, even if it inherently relies on a good deal of conjecture about internal motivations.
I think it is probably more a result of wanting to release a new, better model as often as possible. Last year we saw AI companies cluster releases together, as if rushing to put something out whenever anyone else did, so that the media frenzy wouldn’t be exclusively about their competitor. A problem with that approach is that you have to either hold something good back while waiting for competitors to release, or push something out prematurely. Everyone wants to get the last word, putting out the model that will be perceived as best for the next couple of months. That’s tough to pull off, so an alternative is to simply release as frequently as possible rather than trying to time things cleverly. Then, if you keep pace capability-wise, your models will be the best most often, simply because you release more. The problem with that strategy is that you end up spending more compute on overlapping training runs (e.g., doing a big post-training run at the same time as pre-training the next big thing). A more focused approach, with fewer parallel training runs and a slower release cycle, can use limited compute more effectively. The main question (for the race) becomes: who is managing all these trade-offs most effectively?
I am as cynical about Sam Altman as anyone, but he does consistently say that he believes iterative deployment is important so that the public can see the frontier of AI as it emerges and society can adapt to its development. It seems plausible that he, and many other people at OpenAI, actually believe this.