It is often said that control is for safely getting useful work out of early powerful AI, not making arbitrarily powerful AI safe.
If it turns out that a large, rapid, local capabilities increase is possible, the leading developer could still opt to spend some inference compute on safety research rather than all of it on capabilities research.
I agree that some inference compute can be shifted from capabilities to safety, and that this would work just as well even during a software intelligence explosion.
My worry was more that much of the control agenda, and threat models like rogue internal deployments aimed at acquiring more compute, would be fundamentally threatened if the assumption that more power requires more hardware compute turned out to be wrong. If instead a software intelligence explosion could in principle run on a fixed stock of computing power, then catastrophic actions to disempower humanity or defeat control defenses would be much easier for the model.
I’m not saying control is automatically doomed under FOOM/a software intelligence explosion, but I wanted to make sure that FOOM being true wouldn’t break many control techniques, defenses, and hopes.