Yeah I don’t know how far this will generalize, but the fact that it learns this association at all is a very good sign. My default expectation is for LLMs to learn things in a disconnected way (like how “France’s capital is Paris” and “Paris is in France” are completely different circuits) and this is evidence against that in an alignment-relevant situation.