Thanks, interesting! I had not read this paper before.
Some initial thoughts:
Very cool and satisfying that all these scaling laws might emerge from metric space geometry (i.e. dimensionality).
Main differences seem to be: (1) they tackle model scaling; (2) their data manifold is a product of the model, while our latent space is a property of the data and its generating process itself; and (3) they provide empirical evidence.
They note that model scaling seems to be pretty independent of architecture. I wonder if, in most cases, the relevant model scaling law is closer to our model, where the dimensionality is a property of the data before any processing by the model.
I might get around to running empirical experiments on this, though I'm pretty busy trying out all my other ideas, heh. I'd definitely welcome work from others on this! The way I was thinking about testing it: set up a synthetic regression dataset where you explicitly generate data from a latent space, then see how loss scales as you increase the amount of data (rough sketch below).
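For concreteness, here's a minimal sketch of the kind of experiment I mean. All the specifics are placeholder assumptions of mine, not a definitive setup: a random tanh embedding of the latents into a higher-dimensional input space, a smooth sin target, and a 1-nearest-neighbor regressor so there's no training loop to tune.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

d = 4   # intrinsic latent dimension (what the scaling exponent should depend on)
D = 64  # ambient input dimension

# Fixed random maps: latent -> input embedding, latent -> target.
# (Arbitrary illustrative choices, not from the paper.)
W_embed = rng.normal(size=(d, D))
w_target = rng.normal(size=d)

def sample(n):
    z = rng.uniform(-1, 1, size=(n, d))  # latent variables
    x = np.tanh(z @ W_embed)             # inputs lie on a d-dim manifold in R^D
    y = np.sin(z @ w_target)             # smooth target, a function of the latents only
    return x, y

x_test, y_test = sample(10_000)

# Sweep dataset size and watch how test loss falls.
for n in [100, 1_000, 10_000, 100_000]:
    x_train, y_train = sample(n)
    model = KNeighborsRegressor(n_neighbors=1).fit(x_train, y_train)
    mse = np.mean((model.predict(x_test) - y_test) ** 2)
    print(f"n={n:>7}  test MSE={mse:.5f}")
```

If the latent-space picture is right, the slope of log-MSE vs. log-n should track the latent dimension d rather than the ambient dimension D; for a 1-NN regressor of a Lipschitz target on a d-dimensional manifold you'd expect test MSE to fall off roughly like n^(-2/d), so rerunning the sweep at different d and fitting the exponent would be the actual test.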