Fair enough; I don’t disagree that this is likely how current LLMs work.
I maintain, however, that it makes me very skeptical that their architecture is AGI-complete. In particular, I expect it’s incapable of supporting the sort of high-fidelity simulations that people often talk about in the context of, e.g., accelerating alignment research. And, conversely, that any architecture powerful enough to do so would be different enough to support search, and would therefore carry the dangers of inner misalignment.
I can sort of see the alternate picture, though, where the shallow patterns they implement include some sort of general-enough planning heuristics that would, in theory, let them make genuinely novel inferences over enough steps. I think that’d run into severe inefficiencies… but my intuition there is a bit difficult to unpack.
Hm. Do you think the current LLM architectures are AGI-complete, if you scale them enough? If yes, how do you imagine they’d be carrying out novel inferences, mechanically? Inferences that require making use of novel abstractions?