Thanks for the questions!
Yes, “QFT” here stands for “statistical field theory” :). We thought this would be more recognizable to people (and, at least to some extent, statistical field theory is a special case of quantum field theory). We aren’t making any quantum proposals.
We’re following (part of) this community, and are interested in understanding and connecting the different parts better. Most papers in the “reference class” we have looked at come from (a variant of) this approach. (The authors usually don’t assume Gaussian inputs or outputs, but just large width compared to depth and to the number of datapoints, which does make them “NTK-like”, or at least perturbatively Gaussian, in a suitable sense.)
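For readers to whom “NTK-like, or at least perturbatively Gaussian” is unfamiliar, the background fact being invoked is roughly the standard large-width story (a sketch of the general setup, not a claim about any specific paper in the reference class): at infinite width a randomly initialized network is a Gaussian process, training is governed by the neural tangent kernel, and finite width enters through perturbative corrections.

```latex
f_{\theta_0}(x) \;\sim\; \mathcal{GP}\bigl(0,\, K(x,x')\bigr) \quad \text{(width} \to \infty\text{)},
\qquad
\Theta(x,x') \;=\; \nabla_\theta f_\theta(x)\cdot \nabla_\theta f_\theta(x').
```

The corrections to this picture are suppressed by ratios such as depth/width and (number of datapoints)/width, which is the sense in which “large width compared to depth and number of datapoints” puts a model in the perturbatively Gaussian regime.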
Neither of us thinks that you should think of AI as being in this regime. One of the key issues here is that Gaussian models cannot capture any regularities of the data beyond correlational ones (and it’s a big accident that MNIST is learnable by Gaussian methods). But we hope that what AIs learn can largely be well described by a hierarchical collection of different regimes, where the “difference”, suitably operationalized, between the simpler interpretation and the more complicated one is well modeled by a QFT-like theory (in a reference class that includes perturbatively Gaussian models but is not limited to them). In particular, one thing we’d expect to occur in certain operationalizations of this picture is that once you have some coarse interpretation that correctly captures all generalizing behaviors (but that may need to be perturbed/suitably denoised to get good loss), the last and finest emergent layer will be exactly something in the perturbatively Gaussian regime.
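To spell out “no regularities beyond correlational ones”: a (zero-mean) Gaussian field theory is completely determined by its two-point function, and every higher moment reduces to pair correlations via Wick’s theorem, e.g.

```latex
S[\phi] \;=\; \tfrac{1}{2}\int \phi\, K\, \phi
\;\;\Longrightarrow\;\;
\langle \phi_1 \phi_2 \phi_3 \phi_4 \rangle
\;=\; \langle \phi_1 \phi_2\rangle \langle \phi_3 \phi_4\rangle
+ \langle \phi_1 \phi_3\rangle \langle \phi_2 \phi_4\rangle
+ \langle \phi_1 \phi_4\rangle \langle \phi_2 \phi_3\rangle .
```

So any structure in the data that does not show up in second-order statistics is invisible to a model in this regime.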
Note that I think I’m more bullish about this picture and Lauren is more nuanced (maybe she’ll comment on this). But we both think it is likely that a good understanding of perturbatively Gaussian renormalization would be useful for “patching up the holes”, as it were, of other interpretability schemes. A low-hanging fruit here: whenever you have a discrete feature-level interpretation of a model, instead of just directly measuring the reconstruction loss, you should at minimum model the difference between the model and the interpretation as a perturbative Gaussian (corresponding to assuming the difference has “no regularity beyond correlation information”).
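Here is a minimal sketch of that low-hanging fruit, purely illustrative: the names `acts` and `recon` (the model's activations and the interpretation's reconstruction of them), and the simple held-out split, are assumptions of the sketch, not anything specified in the post.

```python
import numpy as np

def gaussian_residual_score(acts, recon, eps=1e-6):
    """Score a discrete interpretation two ways: (i) raw reconstruction error,
    and (ii) how well a fitted Gaussian -- a 'no regularity beyond correlations'
    model -- accounts for the residual the interpretation leaves behind."""
    resid = acts - recon                          # shape (n_samples, d)
    n, d = resid.shape
    train, test = resid[: n // 2], resid[n // 2 :]

    # Fit a Gaussian to the residual on one half of the data.
    mu = train.mean(axis=0)
    cov = np.cov(train, rowvar=False) + eps * np.eye(d)

    # Average Gaussian log-density of held-out residuals.
    diff = test - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    loglik = -0.5 * (maha + logdet + d * np.log(2 * np.pi))

    return {
        "reconstruction_mse": float(np.mean(resid ** 2)),
        "residual_gaussian_loglik": float(loglik.mean()),
    }
```

The point of reporting the second number is to separate “structure the interpretation missed” from “residue with no regularity beyond its correlations”.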
We don’t want to assume homogeneity, and this is mostly covered by 2b-c above. I think the main point we want to get across is that it’s important and promising to try to go beyond the “homogeneity” picture—and to try to test this in some experiments. I think physics has a good track record here. Not on the level of tigers, but for solid-state models like semiconductors. In this case you have:
1. The “standard model” only has several-particle interactions (corresponding to the “small-data limit”).
2. By applying RG techniques to a regular metallic lattice (with initial interactions coming from the standard model), you end up with a new universality class of QFTs (which now contains new particles, like phonons and excitons, dictated by the RG analysis at suitable scales). You can be very careful and work out the renormalized coupling parameters in this class exactly, but it is much more realistic, and easier, to just get them from a couple of measurements. At the NN level, “many particles arranged into a metallic pattern” corresponds to some highly regular structure in the data (again, we think “particles” here should correspond to datapoints, at least in the current RLTC paradigm).
3. The regular metal gives you a “background” theory, and we now view impurities as a discrete random-feature theory on top of this background. Physicists can still run RG on this theory by zooming out and treating the impurities as noise, but in fact you can also understand the theory on a fine-grained level near an impurity by a more careful form of renormalization, where you view the nearest several impurities as discrete sources and only coarse-grain faraway impurities as statistical noise. At least for me, the big hope is that this last move is also possible for ML systems. In other words, when you are interpreting a particular behavior of a neural net, you can model it as a linear combination of a few messy discrete local circuits that apply in this context (like the complicated diagram from Marks et al. below) plus a correctly renormalized background theory associated to all other circuits (plus corrections from other layers, plus …).
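To make the last point slightly more concrete, here is a toy, purely hypothetical version of the “nearby impurities as discrete sources, faraway impurities as noise” move; the per-circuit attribution array `circuit_effects` and the independence assumption behind the summed covariance are illustrative choices, not anything from the post or from Marks et al.

```python
import numpy as np

def local_circuits_plus_background(circuit_effects, k=3, rng=None):
    """Keep the k circuits with the largest estimated effect on the behavior of
    interest as explicit discrete 'sources'; coarse-grain all remaining circuits
    into a Gaussian background described only by its mean and covariance."""
    if rng is None:
        rng = np.random.default_rng(0)
    effects = np.asarray(circuit_effects, dtype=float)   # (n_circuits, d)

    norms = np.linalg.norm(effects, axis=1)
    top = np.argsort(norms)[-k:]                         # explicit local circuits
    rest = np.delete(effects, top, axis=0)               # everything else

    explicit = effects[top].sum(axis=0)
    # Treat the remaining circuits as i.i.d. draws, so their sum has mean
    # len(rest) * mean and covariance len(rest) * cov (an independence assumption).
    mu = rest.sum(axis=0)
    cov = np.cov(rest, rowvar=False) * len(rest)
    background = rng.multivariate_normal(mu, cov)        # one coarse-grained sample

    return explicit + background, {"kept_circuits": top, "background_mean": mu}
```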
On 1., you should consider that, for people who don’t know much about QFT and its relationship with SFT (like, say, me 18 months ago), it is not at all obvious that QFT can be applied beyond quantum systems!
In my case, the first time I read about “QFT for deep learning” I dismissed it automatically because I assumed it would involve some far-fetched analogies with quantum mechanics.
> but in fact you can also understand the theory on a fine-grained level near an impurity by a more careful form of renormalization, where you view the nearest several impurities as discrete sources and only coarse-grain faraway impurities as statistical noise.

Where could I read about this?
https://www.cond-mat.de/events/correl22/manuscripts/vondelft.pdf