"or that training generalizes far beyond competition math and coding": well, it does generalize beyond math and coding, at least in the sense that RL on math and/or coding also lifts benchmark performance in non-STEM fields. If you mean that setting up good environments and rewards for RLVR is hard, that's true, but there has been some interesting research on it.

I think scaling will continue, whether it's pre-training, inference, or RL. And I think funding will keep flowing. Yes, capabilities will need to keep improving, and there's no specific capability threshold that must be reached for funding to continue (maybe there is, but nobody knows it yet). Models are already getting better; whether they continue to is another question, since RL needs compute and data, and how to do RL on non-verifiable tasks is still open research. But I'm fairly optimistic we'll get to very good capabilities with long context, continuous reasoning, tool calling, and more agentic and vision stuff.
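To make the verifiable/non-verifiable distinction concrete, here's a minimal sketch of the kind of reward function RLVR relies on (the function names and the \boxed{} answer convention are illustrative assumptions, not from any particular codebase). The key property is that correctness can be checked programmatically, with no learned judge:

```python
import re

def extract_final_answer(completion: str) -> str | None:
    """Pull the last \\boxed{...} answer out of a model completion, if any."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", completion)
    return matches[-1].strip() if matches else None

def math_reward(completion: str, ground_truth: str) -> float:
    """Binary verifiable reward: 1.0 if the extracted answer exactly
    matches the known ground truth, else 0.0. The exact-match check is
    what makes the task 'verifiable'."""
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth.strip() else 0.0

# Example: a correct and an incorrect completion for "2 + 2"
print(math_reward("The sum is \\boxed{4}", "4"))   # 1.0
print(math_reward("I think it's \\boxed{5}", "4")) # 0.0
```

Tasks like creative writing or open-ended advice have no ground truth to slot into a check like this, which is exactly why rewarding them is still an open research problem.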