Thanks! Are you saying there is a better way to find citations than a random walk through the literature? :)
I didn’t realize that the pictures above limit to literal pieces of sin and cos curves (and Lissajous curves more generally). I suspect this is a statement about the singular values of the “sum” matrix S of upper-triangular 1s?
The “developmental clock” observation is neat! Never heard of it before. Is it a qualitative “parametrization of progress” thing or are there phase transition phenomena that happen specifically around the midpoint?
Hehe. Yes that’s right, in the limit you can just analyse the singular values and vectors by hand, it’s nice.
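For anyone who wants to check the "by hand" claim numerically, here is a minimal sketch. It assumes S is the n×n upper-triangular all-ones (cumulative sum) matrix. The helper names `sum_matrix` and `closed_form_singular_values` are made up for illustration. The closed form used is the standard one obtained from the fact that the inverse of S is a first-difference matrix, so the inverse of S Sᵀ is tridiagonal with sinusoidal eigenvectors.

```python
import numpy as np

def sum_matrix(n):
    """Upper-triangular n x n matrix of ones (assumed meaning of the "sum" matrix S)."""
    return np.triu(np.ones((n, n)))

def closed_form_singular_values(n):
    """sigma_k = 1 / (2 sin((2k - 1) pi / (4n + 2))) for k = 1..n, in descending order."""
    k = np.arange(1, n + 1)
    return 1.0 / (2.0 * np.sin((2 * k - 1) * np.pi / (4 * n + 2)))

n = 50
S = sum_matrix(n)

# np.linalg.svd returns singular values sorted from largest to smallest.
U, sigma, Vt = np.linalg.svd(S)

# Compare the numerical singular values with the closed form.
print(np.allclose(sigma, closed_form_singular_values(n)))

# The leading singular values fall off roughly like 1 / (2k - 1).
print(sigma[:4] / sigma[0])
```

Plotting the columns of `U` (or the rows of `Vt`) against the index should show sampled sinusoids of increasing frequency, and plotting one against another gives the Lissajous-type curves discussed above.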
No general implied connection to phase transitions, but the conjecture is that if there are phase transitions in your development, then you can for general reasons expect PCA to “attempt” to use the implicit “coordinates” provided by the Lissajous curves to locate stages within the overall development. (These coordinates form a binary tree: the first Lissajous curve uses PC2 to split the PC1 range in half, and so on.) I got some way towards proving that by extending the literature I cited in the parent, but had to move on, so take the story with a grain of salt. This seems to make sense empirically in some cases (e.g. our paper).
One of the talks at ILIAD had a set of PCA plots where PC2 turned around at different points for different training setups. I think the turning point corresponded to when the model started to overfit, but I don’t quite remember. Whatever the meaning of the turning point was, I think they also verified it with some other observation. Given that this was ILIAD, the other observation was probably the LLC.
If you want to look it up I can try to find the talk among the recordings.
The paper you’re thinking of is probably The Developmental Landscape of In-Context Learning.
It looks related, but these are not the plots I remember from the talk.