Crossing the chimp-human boundary takes a mind from useless for outpacing evolution to eminently useful. But LLMs can talk and solve IMO problems, while chimps can’t, so I wouldn’t count on LLMs not already being beyond this boundary. LLMs merely need to somehow become the engine of a closed loop that works towards stronger cognitive capabilities, without necessarily possessing such capabilities themselves, or even broad human-level capabilities. Evolution is too slow to usefully run such a loop within modern compute, but some LLM-juggling process could be much faster. And humans, when not part of the closed loop of human culture and civilization, remain as useless as chimps in reaching for superintelligence.
(RLVR is clearly deficient in the jaggedness of its results in practice, but that’s plausibly a problem of RLVR training data not being bitter-pilled. And conceptual invention might need many steps of using RLVR-trained reasoning to formulate new RLVR tasks for training the next step. So automating both the generation of RLVR training data and its application in training might compensate for these issues well enough.)