Current LLMs seem to be relatively easy to align by writing those kinds of specifications, and they mostly don’t try to do harmful things, at least not the models from the frontier labs. I just think that soon after LLM-based AGI gets developed, one of the first tasks given to the LLM will probably be to develop novel, more efficient AI architectures to reduce the high energy usage of current ones, since LLM-based AGI will probably consume even more energy than current LLMs.
And the LLM might not be as careful as safety researchers when searching for more efficient architectures, especially if humans pressure it to test new approaches anyway despite risks the LLM may be aware of, because the humans are focused more on the positive potential of the technology.
My guess is that it will end up with some approach that uses neuralese rather than being language-based, because language is ambiguous, loses meaning, and most importantly limits the AI’s thinking to concepts known to humans, which do not cover all the possible concepts in the very vast “concept-space” of superintelligent understanding. Language also constrains the AI not just in its concepts but in the very nature of human reasoning, which is most likely not the most effective way to find a solution to a given problem.
So basically, at some point excessively high energy demands will pressure AI development to switch from language models to neuralese models, which are hard to align, let alone understand.
Unless, that is, the LLM is instead tasked with finding a breakthrough in fusion power, which might then let us sustain LLM training and inference.