I basically agree with the descriptive model, but don’t see that the conclusions follow.
For example:
"The token bottleneck is real."
Sure, and so are limits like short-term memory for humans. That doesn't stop us.
And the same applies to relying only on shallow heuristics: humans mostly do the same thing.
Maybe, but I'd guess that's a difference of less than an order of magnitude. And the relevant question isn't only how many bits are passed between circuits: LLMs, even without reasoning, are autoregressive, so they can reason sequentially over multiple tokens. (And with reasoning, that's obviously even more true.)
To the extent that LLM agents are agents, they definitely do this too! And if we're talking about single-forward-pass reasoning, very few humans intentionally train their System 1 to do anything better than semi-randomly following patterns that worked before. (If you don't know what I'm referring to, see the discussion of firefighters not actually making decisions, and the resolution of the System 1 / System 2 debate, in Thinking, Fast and Slow.)