Realmbird

Karma: 74

Realmbird 2 Jun 2026 1:32 UTC
1 point
0
in reply to: ovindu-a’s comment on: NLA Thought Anchors
For this 7B-scale NLA experiment, I used two 3090s:
one for the AV on SGLang and the other for generating responses and running the AR.
With 40 rollouts per prompt on GSM8K, the total time was around 24 hours.
It should be much faster if you ran the 7B model on an A100.

Realmbird 1 Jun 2026 0:17 UTC
1 point
0
in reply to: Realmbird’s comment on: Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
For Qwen2.5-7B-Instruct’s NLAs, I found evidence that the answer appears in the AV more often as the token position approaches the model’s final answer.

NLA Thought Anchors

Realmbird 31 May 2026 23:38 UTC

10 points

3 comments · 4 min read · LW link

Realmbird 30 May 2026 18:36 UTC
1 point
0
in reply to: ryan_greenblatt’s comment on: Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
I mean token position, e.g. the final answer token vs. a border token. The AV on the final answer token shows the final answer at a higher rate. For the 27B results, which token was used?

NLA Verbalizations on AuditBench: Llama 70B

Realmbird 16 May 2026 5:25 UTC

10 points

0 comments · 3 min read · LW link

Realmbird 4 May 2026 15:14 UTC
1 point
0
in reply to: joseph_c’s comment on: MHC Interp #1: Previous-Token Heads Become Attention Sinks Under Manifold-Constrained Hyper-Connections
Cool idea

MHC Interp #1: Previous-Token Heads Become Attention Sinks Under Manifold-Constrained Hyper-Connections

Realmbird 3 May 2026 11:06 UTC

21 points

2 comments · 5 min read · LW link

Latent Reasoning Sprint #4: PCA Analysis on CoDI

Realmbird 18 Apr 2026 21:25 UTC

7 points

0 comments · 3 min read · LW link

Realmbird 18 Apr 2026 18:11 UTC
1 point
0
on: Can we interpret latent reasoning using current mechanistic interpretability tools?
Given that CoDI throws away the hidden state and only uses the KV values on the <|eocot|> token, the accuracy drop after latent 5 could simply be because the KV values can’t store more information.

Latent Reasoning Sprint #3: Activation Difference Steering and Logit Lens

Realmbird 4 Apr 2026 3:56 UTC

15 points

0 comments · 4 min read · LW link

Latent Reasoning Sprint #2: Token-Based Signals and Linear Probes

Realmbird 19 Mar 2026 3:39 UTC

6 points

0 comments · 3 min read · LW link

Latent Reasoning Sprint #1: Tuned Lens and Logit Lens on CODI

Realmbird 6 Mar 2026 18:36 UTC

7 points

1 comment · 4 min read · LW link

Exploration of Counterfactual Importance and Attention Heads

Realmbird 30 Sep 2025 1:17 UTC

13 points

0 comments · 6 min read · LW link

Realmbird 18 Aug 2025 21:10 UTC
1 point
0
in reply to: Paul Bogdan’s comment on: Thought Anchors: Which LLM Reasoning Steps Matter?
How did you learn that vertical attention corresponded to sentences?