Thanks a lot for writing that post.
One question I have regarding fast takeoff: don’t you expect learning algorithms much more efficient than SGD to show up and greatly accelerate the rate of capabilities development?
One “overhang” I can see is the fact that humans have written down, on the internet, much of what they know about how to do all kinds of tasks, so a fairly data-efficient algorithm could leverage this and quite suddenly learn a ton of tasks.
For instance, in-context learning is far more data-efficient than SGD in pre-training. Right now, in-context learning doesn’t seem to be exploited nearly as much as it could be. If we manage to turn ~any SGD learning problem into an in-context learning problem, which IMO could happen with an efficient long-term memory and a longer context length, things could accelerate pretty wildly. Do you think that even developments like that (i.e., we unlock a more data-efficient algorithm that allows much faster capabilities development) will necessarily be smoothed?
Brains use somewhat less lifetime training compute than GPT-4 (perhaps 0 to a few OOM less), and 2 or 3 OOM less data. That provides an existence proof of somewhat better scaling curves, along with some evidence that scaling curves much better than the ones brains are on are probably hard to reach.
AI systems already train on the entire internet, so I don’t see how that is an overhang.
There are diminishing returns to context for in-context learning; it is extremely RAM-intensive, and GPUs are RAM-starved compared to the brain; and finally, brains already use it with much longer context. So it’s more like one of the hard challenges in achieving brain parity at all than a big overhang.
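To make the RAM point a bit more concrete, here is a rough back-of-the-envelope sketch of how a transformer's key/value cache grows with context length. The model dimensions are hypothetical, chosen only for illustration and not taken from any particular system, and the formula assumes standard attention with one cached key vector and one cached value vector per token, per layer, per KV head.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV-cache size in bytes for a single sequence.

    Each token stores one key and one value vector (hence the factor 2)
    in every layer, for every KV head. bytes_per_elem=2 assumes 16-bit floats.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model shape (an assumption for illustration only).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 80, 64, 128

for seq_len in (8_192, 131_072, 1_048_576):
    gib = kv_cache_bytes(seq_len, N_LAYERS, N_KV_HEADS, HEAD_DIM) / 2**30
    print(f"{seq_len:>9} tokens -> {gib:8.1f} GiB of KV cache")
```

The cache grows linearly with context, so under these assumed dimensions a million-token context would need terabytes of fast memory per sequence. That is one sense in which very long in-context learning is memory-bound rather than an easy win.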
I am definitely semi-agnostic about whether SGD will ultimately be the base optimizer of choice, and about whether the inner algorithm will do better than SGD and cause a fast takeoff.
But I’ll assume that you are right about fast takeoff happening, and my response to that is that this would leave the alignment schemes proposed intact, for the following reasons:
Even if fast takeoff happens, the sharp left turn in the form of misgeneralization is still less likely to happen, because unlike evolution, we are unlikely to keep running fresh versions of an AI; instead we retain the same AI throughout the training run.
It mostly doesn’t affect how easy it is to learn values, and the trick of using our control of SGD as the innate reward system still works, because weak genetic priors that are easy to trick, plus the innate reward system’s local update rule, still suffice to make people reliably have a set of values like empathy for the ingroup.
SGD still has really strong corrective properties against inner misaligned agents, unlike evolution.
I do agree that fast takeoff complicates the analysis, but I don’t think it breaks the alignment methods shown in the post. If alignment required very strong priors (though with SGD we can align systems to reward functions much more complicated than genetic priors can specify), or if we couldn’t control the innate reward system, this would be a much bigger issue.
I think there are plausible stories in which a sharp left turn could happen (but as you’ve pointed out, it is extremely unlikely under the current deep learning paradigm).
For example, suppose it turns out that a class of algorithms I will simply call heuristic AIXI is much more powerful than the current deep learning paradigm.
The idea behind this class of algorithms is that you basically do evolution, but instead of using blind hill-climbing, you periodically ask “what is the best learning algorithm I have?” and then apply that to your entire process. Because this means you are constantly changing the learning algorithm, you could get the same sort of 1Mx overhang that caused the sharp left turn in human evolution.
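As a very rough illustration of that loop, here is a toy sketch in which many learners run in parallel and, periodically, the best-performing learner's update rule is copied to the whole population. Everything in it is an assumption made for illustration: the "learning algorithm" is reduced to a single step size, the objective is a 1-D quadratic, and names like `run` are hypothetical. It is closer in spirit to population-based training than to anything AIXI-like, and it is not a claim about how such a system would actually be built.

```python
import random


def loss(w):
    return (w - 3.0) ** 2  # toy objective with its minimum at w = 3


def grad(w):
    return 2.0 * (w - 3.0)


def run(population_size=8, rounds=5, steps_per_round=10, seed=0):
    rng = random.Random(seed)
    # Each learner has its own weight and its own "learning algorithm",
    # reduced here to a single step size (a deliberate oversimplification).
    learners = [
        {"w": 0.0, "lr": 10 ** rng.uniform(-4, -1)}
        for _ in range(population_size)
    ]
    history = []
    for _round in range(rounds):
        # Ordinary learning: every learner trains with its current rule.
        for learner in learners:
            for _step in range(steps_per_round):
                learner["w"] -= learner["lr"] * grad(learner["w"])
        # Periodic step: ask which learner is doing best, then apply its
        # learning rule to the whole population (weights are left untouched).
        best = min(learners, key=lambda m: loss(m["w"]))
        for learner in learners:
            learner["lr"] = best["lr"]
        history.append(loss(best["w"]))
    return history, learners


history, learners = run()
for i, value in enumerate(history, start=1):
    print(f"round {i}: best loss = {value:.4e}")
```

The point of the toy is only the structure: the outer loop does not just select good weights, it changes the rule by which every learner updates from then on.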
The obvious counter is that if we think heuristic AIXI is not safe, then we should just not use it. But the obvious counter to that is: when have humans ever not done something because someone else told them it wasn’t safe?
I definitely agree with the claim that evolutionary strategies being effective would weaken my entire case. I do think that evolutionary methods like GAs are too hobbled by their inability to exploit white-box optimization, unlike SGD, but we shall see.
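A minimal sketch of the white-box versus black-box gap, under assumptions chosen purely for illustration: the objective is a 50-dimensional quadratic, the black-box method is a simple mutate-and-keep-if-better hill climber (not a full GA with a population and crossover), and both get the same number of update steps. The function names are hypothetical.

```python
import random


def loss(x):
    """Simple quadratic bowl; the minimum is at the origin."""
    return sum(v * v for v in x)


def grad(x):
    """Analytic gradient of the quadratic loss (the white-box signal)."""
    return [2.0 * v for v in x]


def gradient_descent(x0, steps, lr=0.1):
    x = list(x0)
    for _ in range(steps):
        x = [v - lr * g for v, g in zip(x, grad(x))]
    return loss(x)


def mutation_hill_climb(x0, steps, sigma=0.1, seed=0):
    """Black-box search: propose a random mutation, keep it only if loss improves."""
    rng = random.Random(seed)
    x, best = list(x0), loss(x0)
    for _ in range(steps):
        candidate = [v + rng.gauss(0.0, sigma) for v in x]
        candidate_loss = loss(candidate)
        if candidate_loss < best:
            x, best = candidate, candidate_loss
    return best


DIM, STEPS = 50, 100
start = [1.0] * DIM
gd_loss = gradient_descent(start, STEPS)
hc_loss = mutation_hill_climb(start, STEPS)
print(f"gradient descent loss:    {gd_loss:.3e}")
print(f"mutation hill-climb loss: {hc_loss:.3e}")
```

On a smooth problem like this, the gradient tells the optimizer which direction to move in every dimension at once, while random mutation has to stumble on improvements. This says nothing about non-differentiable search spaces, where the comparison could look quite different.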
I genuinely don’t know if heuristic AIXI is a real thing or not, but if it is, it combines the ability to search the whole space of possible algorithms (which evolution has but SGD doesn’t) with the ability to take advantage of higher-order statistics (which SGD has but evolution doesn’t).
My best guess is that just as there was a “deep learning” regime that only got unlocked once we had tons of compute from GPUs, there’s also a heuristic AIXI regime that unlocks at some level of compute.