Yes, that’s what I’m implying. We don’t have super concrete evidence for it, but a large contingent of researchers (including myself) believes things like that are happening.
To understand why neural networks might do this, you can view them as doing a sort of approximate Solomonoff induction. The neural network is a “program” (analogous to a Python program written in code) that has to model its data as well as possible, but the model is only so big, which means the program is limited in length. If you think about how you would write a program to generate images, it would be far more code-efficient to encode abstractions and rules for things like geometry than to enumerate every possible image. The training/optimization process ends up discovering the same thing.
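To make the “short program beats a lookup table” intuition concrete, here’s a toy sketch (plain numpy, nothing to do with how a real network is implemented): a few lines of geometric rules generate thousands of images, while storing those images explicitly would cost far more bytes.

    # Toy illustration of the "short program vs. giant lookup table" intuition.
    import numpy as np

    def render_circle(size=64, cx=32, cy=32, r=10):
        """Render a binary image of a filled circle from one geometric rule."""
        ys, xs = np.mgrid[0:size, 0:size]
        return ((xs - cx) ** 2 + (ys - cy) ** 2 <= r ** 2).astype(np.uint8)

    # A ~5-line "program" covers every center/radius combination...
    images = [render_circle(cx=cx, cy=cy, r=r)
              for cx in range(8, 56) for cy in range(8, 56) for r in range(2, 8)]

    # ...whereas enumerating the same images as raw pixels costs far more bits.
    print(len(images), "images,", images[0].nbytes * len(images), "bytes if stored explicitly")

A model under a size constraint is pushed toward the first kind of solution for the same reason a programmer under a length constraint would be.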
You might also be interested in emergent world representations [1] (models simulate the underlying processes or “world” that generates their data in order to predict it) and the platonic representation hypothesis [2] (different models trained on separate modalities like text and images form similar representations, meaning that an image model will represent a picture of a dog in a similar way to how a text model will represent the word “dog”).
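As a rough sketch of how “similar representations” gets measured: embed the same N concepts with a text model and an image model, build each model’s pairwise cosine-similarity matrix, and compare the two similarity structures. The snippet below uses random arrays as stand-ins for real model embeddings, and a simple correlation of similarity matrices rather than the mutual-nearest-neighbor metric the paper actually uses, so treat it as the flavor of the idea, not the paper’s method.

    import numpy as np

    def similarity_matrix(embeds):
        """Pairwise cosine similarities between row vectors."""
        normed = embeds / np.linalg.norm(embeds, axis=1, keepdims=True)
        return normed @ normed.T

    def alignment_score(text_embeds, image_embeds):
        """Correlate the two models' similarity structures (higher = more similar geometry)."""
        a = similarity_matrix(text_embeds)[np.triu_indices(len(text_embeds), k=1)]
        b = similarity_matrix(image_embeds)[np.triu_indices(len(image_embeds), k=1)]
        return np.corrcoef(a, b)[0, 1]

    # Placeholder data standing in for embeddings of the same 100 concepts from two models.
    rng = np.random.default_rng(0)
    text_embeds = rng.normal(size=(100, 512))
    image_embeds = rng.normal(size=(100, 768))
    print(alignment_score(text_embeds, image_embeds))  # near 0 for random data, higher for aligned models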
1: https://arxiv.org/abs/2210.13382
2: https://arxiv.org/abs/2405.07987