Do you have an explanation for why there is a bridge-entity representation mismatch for synthetic facts but not for real ones? What in “real” training allows LLMs to learn a common representation of input and output entities? Can you emulate that with additional fine-tuning on more synthetic documents?
We don’t have a good explanation. One idea is that bridge entities may need to be more deeply internalized to support latent two-hop reasoning: for example, they may need to occur in many facts as both first and second entities, or perhaps appear in other two-hop questions. The Grokked Transformers paper has some results linking the ratio of e2 and e3 to two-hop performance (in toy grokking settings).
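To make that hypothesis concrete, here is a minimal sketch (not from the paper; all names and the document format are illustrative assumptions) of how one might generate synthetic fact documents so that each bridge entity e2 shows up both as the second entity of one fact (e1 → e2) and as the first entity of another (e2 → e3):

```python
import random

# Hypothetical sketch: build synthetic two-hop chains e1 -(r1)-> e2 -(r2)-> e3
# so that every bridge entity e2 appears in both positions (second entity of
# one fact, first entity of another). The conjecture is that this richer
# coverage might help the bridge entity become "internalized".

ENTITIES = [f"entity_{i}" for i in range(100)]
RELATIONS = ["founded", "married", "mentored", "succeeded"]

def make_fact(e1: str, relation: str, e2: str) -> str:
    """Render a single atomic fact as a short training document."""
    return f"{e1} {relation} {e2}."

def make_two_hop_corpus(num_chains: int, seed: int = 0) -> list[str]:
    """Sample chains and emit two documents per chain."""
    rng = random.Random(seed)
    docs = []
    for _ in range(num_chains):
        e1, e2, e3 = rng.sample(ENTITIES, 3)
        r1, r2 = rng.choice(RELATIONS), rng.choice(RELATIONS)
        docs.append(make_fact(e1, r1, e2))  # e2 as the second entity
        docs.append(make_fact(e2, r2, e3))  # e2 as the first entity
    return docs

if __name__ == "__main__":
    for doc in make_two_hop_corpus(3):
        print(doc)
```

Whether fine-tuning on additional documents like these would actually close the representation gap is an open question; this only illustrates the kind of entity coverage the hypothesis points at.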