I’d have to look back at the methodology to be sure, but assuming the model answers immediately, without any chain of thought, my default guess is that this comes down to the limits of what can be computed in one or a few forward passes. If the model is doing some sort of internal simulation of its own behavior on the task, that plausibly requires more steps of computation than a couple of forward passes allow. Intuitively, at least, this sort of internal simulation is what I imagine happening when humans introspect on hypothetical situations.
If, on the other hand, the model is using some other approach, perhaps circuitry developed specifically for this sort of purpose, then I would expect that approach to handle only fairly simple problems, since that circuitry has to be much smaller than the circuitry for actually handling a very wide range of tasks, i.e., the rest of the model.