It takes subjective time to scale new algorithms, or to match available hardware. Current models still seem to be smaller than what the available training compute and the latest inference hardware could in principle support (GPT-4.5 might be the closest to this, and possibly Gemini 3.0 will catch up). It's taking RLVR two years to catch up to pretraining scale, if we count time from the strawberry rumors, and only Google plausibly had the opportunity to do RLVR on larger models without the massive utilization penalties of the small scale-up worlds of Nvidia's older 8-chip servers.
When there are AGIs, such things will be happening faster, but the AGIs will also have more subjective time, so progress in AI capabilities will seem much slower to them than to us. Letting AGIs push the self-optimize button in the future is not qualitatively different from letting humans push the build-AGI button currently. The process happens faster in physical time, but not necessarily in their own subjective time. Also, the underlying raw compute is much slower in their subjective time.
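As a rough illustration of that last point, here is a toy back-of-the-envelope sketch; the speed-up factor and cycle length are made-up numbers for illustration, not claims about actual AGI speed-ups or training timelines:

```python
# Toy arithmetic: physical time vs. subjective time for a faster-thinking AGI.
# Both numbers below are hypothetical, chosen only to illustrate the shape of the argument.

agi_speedup = 10.0            # assume an AGI that thinks ~10x faster than a human
physical_years_per_cycle = 2  # assume one scaling/self-optimization cycle takes ~2 physical years

subjective_years = physical_years_per_cycle * agi_speedup
print(f"{physical_years_per_cycle} physical years ≈ {subjective_years:.0f} subjective AGI-years")
# Output: "2 physical years ≈ 20 subjective AGI-years"
# The cycle looks fast in our time, but from the AGI's own perspective it is
# correspondingly slow, because the underlying compute hasn't sped up with it.
```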
And if being smarter makes AGIs saner, they’ll convergently notice that pushing the self-optimize button without understanding ASI-grade alignment is fraught (it’s not in the interest of AGIs to create an ASI misaligned with the AGIs). Forcing them not to notice this and keep slamming the self-optimize button as fast as possible might be difficult in the same way that aligning them is difficult.
I was talking about subjective time for us, rather than the AGI. In many of the situations I had in mind, there isn't meaningful subjective time for the AIs, as they may be built, torn down and rearranged, or have their memory wiped. There is a range of continuity/self for AI: at one end is a collection of tool AI agents, in the middle a goal-directed agent, and at the other end a full self that protects its continuous identity in the same way we do.
And if being smarter makes AGIs saner, they’ll convergently notice that pushing the self-optimize button without understanding ASI-grade alignment is fraught
I don’t expect they will be in control or have a coherent enough self to make these decisions. It’s easy for me to imagine an AI agent that is built to optimize AI architectures (it doesn’t even have to know it’s working on its own architecture).