I think you’d want to set the limit to something slightly faster than Moore’s law. Otherwise you have a constant large compute overhang.
We’re going to be limited by Moore’s law (or its successor) growth rates eventually anyway. We’re on a kind of z-curve right now, where we’re transitioning from ML compute being some small constant fraction of all compute to some much larger constant fraction of all compute. Before the transition, ML compute grows at the same speed as compute in general. After the transition, it also grows at the same speed as compute in general. In the middle, it grows faster as we rush to spend a much larger share of GWP on it.
From that perspective, Moore’s law growth is the minimum growth rate you might have (unless annual spend on ML shrinks). And the question is just whether you transition from the small constant fraction of all compute to the large constant fraction of all compute slowly or quickly.
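To make the shape of that curve concrete, here is a rough sketch, not a forecast. It assumes ML compute is a share of total compute, that total compute doubles every 2 years, and that the share moves smoothly from a small constant to a large one. The share values (0.1% and 10%), the midpoint, and the transition width are all made-up illustrative numbers, and the function names are hypothetical.

```python
import math

def ml_share(t, low=0.001, high=0.1, midpoint=20.0, width=2.0):
    """Share of all compute spent on ML at year t (illustrative numbers only).

    Interpolates between `low` and `high` in log space using a logistic
    weight, so the share is roughly constant long before and long after
    the midpoint.
    """
    s = 1.0 / (1.0 + math.exp(-(t - midpoint) / width))
    return low ** (1.0 - s) * high ** s

def total_compute(t, doubling_years=2.0):
    """Total compute at year t, normalised to 1 at t = 0 (assumed Moore's-law-like)."""
    return 2.0 ** (t / doubling_years)

def ml_compute(t):
    """ML compute is the ML share times total compute."""
    return ml_share(t) * total_compute(t)

def annual_growth(f, t):
    """Growth factor of f over the year starting at t."""
    return f(t + 1) / f(t)

# Before, during, and after the assumed transition.
for year in (0, 20, 40):
    print(year,
          round(annual_growth(total_compute, year), 3),
          round(annual_growth(ml_compute, year), 3))
```

Under these assumptions, ML compute tracks total compute at both ends and only outpaces it while the share is changing, which is the sense in which Moore’s law growth is the floor.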
Trying to not do the transition at all (i.e. trying to grow at exactly the same rate as compute in general) seems potentially risky, because the resulting constant compute overhang means it’s relatively easy for someone somewhere to rush ahead locally and build something much better than SOTA.
If, on the other hand, you say full steam ahead and don’t try to slow the transition at all, then on the plus side the compute overhang goes away, but on the minus side you might rush into dangerous and destabilizing capabilities.
Perhaps a middle path makes sense, where you slow the growth rate down from current levels, but also slowly close the compute overhang gap over time.
Moore’s law is a doubling every 2 years, while this proposes doubling every 18 months, so pretty much what you suggest (not sure if you were disagreeing tbh but seemed like you might be?)
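For a sense of scale, the two doubling times can be compared directly. This is just arithmetic under the numbers quoted above (2 years vs. 18 months); the helper names are hypothetical, and the 64x overhang in the example is an arbitrary illustrative figure, not an estimate from this thread.

```python
import math

def annual_growth(doubling_years):
    """Yearly growth factor implied by a given doubling time."""
    return 2.0 ** (1.0 / doubling_years)

def years_to_close(gap_factor, cap_doubling_years, hw_doubling_years):
    """Years for a cap that grows faster than hardware to absorb an overhang.

    `gap_factor` is how many times larger the available compute is than
    the capped compute today. Returns infinity if the cap does not grow
    faster than hardware, since the gap then never closes.
    """
    # Doublings per year that the cap gains on hardware.
    relative_rate = 1.0 / cap_doubling_years - 1.0 / hw_doubling_years
    if relative_rate <= 0:
        return math.inf
    return math.log2(gap_factor) / relative_rate

moore = annual_growth(2.0)   # hardware: doubling every 2 years
cap = annual_growth(1.5)     # proposed cap: doubling every 18 months

print(f"hardware grows {moore:.3f}x per year, cap grows {cap:.3f}x per year")
print(f"cap gains {cap / moore:.3f}x per year on hardware")
print(f"years to close a 64x overhang: {years_to_close(64, 1.5, 2.0):.1f}")
```

The cap gains a factor of 2^(1/6), about 1.12x, per year on hardware, so the gap does close, but slowly: each factor of 2 in overhang takes about 6 years to absorb.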
Otherwise you have a constant large compute overhang.
I think we should strongly consider finding a way of dealing with that rather than only looking at solutions that produce no overhang. For all we know, total compute required for TAI (especially factoring in future algorithmic progress) isn’t far away from where we are now. Dealing with the problem of preventing defectors from exploiting a compute overhang seems potentially easier than solving alignment on a very short timescale.
I suppose a possible mistake in this analysis is that I’m treating Moore’s law as the limit on compute growth rates, and this may not hold once we have stronger AIs helping to design and fabricate chips.
Even so, I think there’s something to be said for trying to slowly close the compute overhang gap over time.
Ah, good point!