How did you learn that vertical attention corresponded to sentences?
Realmbird
Karma: 39
- Realmbird · 18 Aug 2025 21:10 UTC · 1 point · 0 · in reply to: Paul Bogdan’s comment on: Thought Anchors: Which LLM Reasoning Steps Matter?
How did you learn that vertical attention corresponded to sentences?
Since CoDI throws away the hidden state and only uses the KV values at the <|eocot|> token, the accuracy drop after latent 5 could just be because the KV values can’t store more info.