Daniel Kokotajlo comments on How Well Does RL Scale?

Daniel Kokotajlo 24 Oct 2025 17:19 UTC
6 points
“inference scaling as the main surviving form of scaling” --> But it isn’t, though; RL is still a very important form of scaling. Yes, it’ll become harder to scale up RL in the near future (recently they could just allocate more of their existing compute budget to RL, but soon they’ll need to grow their compute budget), so there’ll be a slowdown from that effect. But it seems to me that the next three OOMs of RL scaling will bring at least as much benefit as the previous three OOMs, which was substantial, as you say, largely because it ‘unlocked’ more inference compute scaling. The next three OOMs of RL scaling will ‘unlock’ even more.
Re: Willingness to pay going up: Yes, that’s what I expect. I don’t think it’s hard at all. If you do a bunch of RL scaling that ‘unlocks’ more inference scaling, e.g. by extending METR-measured horizon length, then boom, now your models can do significantly longer, more complex tasks than before. Those tasks are significantly more valuable, and people will be willing to pay significantly more for them.