Tomek Korbak comments on Lessons from Studying Two-Hop Latent Reasoning

Tomek Korbak 12 Sep 2025 12:04 UTC
LW: 2 AF: 1
Yeah, it seems plausible that an entity being activated across different contexts is necessary for it to be represented saliently enough to facilitate multi-hop reasoning. The Grokked transformer paper has some results linking the ratio of e2 and e3 to two-hop performance (in toy settings).
- J Bostock 13 Mar 2026 11:18 UTC
  LW: 2 AF: 1
  Ok so I have actually, after six months, got round to running some experiments here along the lines of the blegg/rube case. The setup is a matching game: I generate k sets of N_nodes elements, like {country, colour, number, animal, …}, then train a model on some of the directed edges (country → animal, etc.) and evaluate on some of the others.
  Toy models and Pythia 70M can both learn to correctly match elements along unseen edges. (Weirdly, Gemma 3 1B can’t do so when trained with LoRA, but Pythia 70M can, and I haven’t tried full Gemma 3 fine-tuning yet.)
  The specific blegg/rube case doesn’t work so well, but using random sub-graphs is pretty successful (Pythia can learn a 6-element matching game from 12 of the 30 directed edges; toy models are a bit less stable). You can also see clusters forming in the residual stream corresponding to each set as abstractions form.
  Might or might not get round to posting it, depending on whether I get a job soon.
  - Tomek Korbak 13 Mar 2026 17:36 UTC
    LW: 2 AF: 1
    I’d be excited to read a write-up!
    - J Bostock 13 Mar 2026 19:14 UTC
      2 points
      I’d be excited to write it! If I do get a position I’m hopeful for, then I’ll see if I can write it up as part of the research there, and if I don’t get that position then I’ll be able to write it up in my free time I suppose. ¯\_(ツ)_/¯
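
For readers trying to picture the matching game described in the thread, here is a minimal sketch of how such a dataset could be generated. It is not J Bostock's code: the function name `make_matching_game`, the synthetic token format, the extra category names "fruit" and "tool", and the uniformly random edge split are all assumptions made for illustration. It also does not check that the training edges form a connected sub-graph, which a real experiment would probably want to control.

```python
import itertools
import random


def make_matching_game(k, categories, n_train_edges, seed=0):
    """Build k sets with one token per category, plus a train/eval edge split.

    Hypothetical helper for illustration only; the original experiments
    may have been set up differently.
    """
    rng = random.Random(seed)

    # Give every category its own shuffled labels so a token's suffix
    # does not reveal which set it belongs to.
    labels = {}
    for c in categories:
        ids = list(range(k))
        rng.shuffle(ids)
        labels[c] = ids
    sets = [{c: f"{c}_{labels[c][i]}" for c in categories} for i in range(k)]

    # Directed edges are ordered pairs of distinct categories,
    # so N categories give N * (N - 1) edges.
    edges = list(itertools.permutations(categories, 2))
    rng.shuffle(edges)
    train_edges = edges[:n_train_edges]
    eval_edges = edges[n_train_edges:]

    def examples(edge_list):
        # One (source token, target category, target token) triple
        # per edge per set.
        return [(s[src], dst, s[dst]) for src, dst in edge_list for s in sets]

    return {
        "sets": sets,
        "train_edges": train_edges,
        "eval_edges": eval_edges,
        "train": examples(train_edges),
        "eval": examples(eval_edges),
    }


# Six categories; "fruit" and "tool" are placeholder names.
categories = ["country", "colour", "number", "animal", "fruit", "tool"]
game = make_matching_game(k=8, categories=categories, n_train_edges=12)
print(len(game["train_edges"]), len(game["eval_edges"]))
```

With six categories there are 6 × 5 = 30 directed edges, so training on 12 of them leaves 18 unseen edges for evaluation, matching the 12-of-30 split mentioned above.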