Yeah, I think the architecture makes this tricky for LLMs to do in one step, since the layers that resolve each hop of the reasoning have to be in the right order: the “Who is Obama’s wife?” lookup has to happen in earlier layer(s) than the “When was Michelle Obama born?” lookup. With CoT both facts still have to be in the model, but it doesn’t matter where.
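
To make the layer-ordering point concrete, here's a toy sketch (not a real transformer, just a hypothetical one-lookup-per-layer model with a made-up `FACTS` store) of why a single forward pass needs the spouse hop to sit before the birth-year hop, while CoT doesn't care:

```python
# Toy illustration: each "layer" can resolve one relation against what
# earlier layers have already written into the running state.
FACTS = {
    ("spouse", "Barack Obama"): "Michelle Obama",
    ("birth_year", "Michelle Obama"): "1964",
}

def single_pass(layers, entity):
    """One forward pass: each layer only sees what earlier layers produced,
    so the 'spouse' lookup must live in an earlier layer than 'birth_year'."""
    state = entity
    for relation in layers:  # layers applied in a fixed order
        state = FACTS.get((relation, state), state)
    return state

def chain_of_thought(layers, entity):
    """CoT: each hop gets its own full pass, and the intermediate answer is
    fed back in as input, so the layer order no longer matters."""
    state = entity
    for hop in ["spouse", "birth_year"]:  # hops spelled out in the text
        for layer_relation in layers:     # a full pass over all layers
            if layer_relation == hop:
                state = FACTS.get((hop, state), state)
        # "Michelle Obama" is now visible to every layer on the next pass
    return state

print(single_pass(["spouse", "birth_year"], "Barack Obama"))       # 1964
print(single_pass(["birth_year", "spouse"], "Barack Obama"))       # Michelle Obama (wrong hop order fails)
print(chain_of_thought(["birth_year", "spouse"], "Barack Obama"))  # 1964 (order doesn't matter)
```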