I think LLMs have a high proportion of fast/shallow/memorizing circuits while humans have a high proportion of slow/deep/generalizing circuits. Increasing circuit depth and generalization in transformers requires a kind of phase change (the metaphor is apt: it is actually very similar to phase changes in physics and chemistry), and I’d be slightly shocked if there weren’t an analogous process in human brains, though possibly a more continuous one. Transformers seem to be worse at this phase change than human brains are, so they lean disproportionately on large amounts of data and memorization. This, for me, explains most of the important differences between LLMs and people.