I thought for some time that we would just scale up models and, once we reached enough parameters, we’d get an AI with a more precise and comprehensive world-model than humans, at which point the AI would be a more advanced general reasoner than humans.
But it seems that we’ve stopped scaling up models in terms of parameters and are instead scaling up RL post-training. Does RL sidestep the need to surpass the equivalent of the human brain’s neurons and neural connections? Or does scaling up RL on these sub-human (in the sense described) models necessarily just lead to models which are superhuman only in narrow domains, but which are worse general reasoners?
I recognise my ideas here are not well-developed; I’m hoping someone will help steer my thinking in the right direction.