This sounds to me like it assumes that if you keep scaling LLMs, you'll eventually get to superintelligence. So my thought was something like: "Hmm, MIRI seems to assume that we'll go from LLMs to superintelligence, but LLMs seem much easier to align than the AIs in MIRI's classic scenarios, and work to scale them will probably slow down eventually, which will also give us more time."
Yes, I can see that downside: if LLMs can't scale enough to speed up alignment research and are not the path to AGI, then having them aligned doesn't really help.
My takeaway from Jacob's work, and my own beliefs, is that you can't separate hardware and computational topology from capabilities. That is, if you want a system to understand and manipulate a 3D world the way humans and other smart animals do, then you need a large number of synapses, specifically in something like a scale-free network design. That means it's not just bandwidth or TEPS (traversed edges per second) that matter, but also many long-distance connections, with only a small number of hops needed between any two given neurons. Our current hardware is not set up to simulate this well, and a single GPU, despite its high FLOPS, can't get anywhere near a human brain on this measure. Additionally, you need a certain network size before the better architecture even gives an advantage: transformers don't beat CNNs on vision tasks until the task reaches a certain difficulty. Combined, these lead me to believe that someone with just a GPU or two won't do anything dangerous with a new paradigm.
Based on this, the observation that computers are already superhuman in some domains isn't necessarily a sign of danger: the network required to play Go simply doesn't need the large, densely connected architecture, because the domain (a small, discrete 2D board) doesn't require it.
I agree that there is danger, and a crux for me is how much better an ANN can be at, say, science than a biological network, given that we have not evolved to do abstract symbol manipulation. On one hand, there are brilliant mathematicians who can outcompete everyone else; however, the same does not apply to biology. Some things require calculation and real-world experimentation, and intelligence can't shortcut them.
If some problems require computation with a specific topology or hardware, then a GPU setup can't just reconfigure itself and FOOM.