Louis Jaburi comments on Against blanket arguments against interpretability

Louis Jaburi 27 Jan 2025 16:01 UTC
1 point
The relevant question then becomes whether the “SGLD” sampling techniques used in SLT for measuring the free energy (or technically its derivative) actually converge to reasonable values in polynomial time. This is checked pretty extensively in this paper for example.
The linked paper only considers large models that are deep linear networks (DLNs). I don't find this very compelling evidence for large models with non-linearities. Other measurements I have seen for bigger/deeper non-linear models seem promising, but I wouldn't call them robust yet (though it is not clear to me whether this is due to an SGLD implementation/hyperparameter issue or to a more fundamental problem).
Until I have a clearer picture of the relationship between free energy and training dynamics under SGD, I agree with OP that the claim is too strong.
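
To make concrete what kind of measurement is at issue, here is a minimal sketch of an SGLD-style estimate of the local learning coefficient (the quantity usually used as a proxy for the derivative of the free energy). It assumes the commonly used estimator form, n·β·(E[L(w)] − L(w*)) with β = 1/log n, sampled from a posterior localized around w* with strength γ. It is not the linked paper's code: the function name `estimate_llc`, the hyperparameter values, and the toy quadratic loss are all hypothetical, and full-gradient Langevin steps stand in for minibatch SGLD.

```python
import numpy as np

def estimate_llc(loss, grad_loss, w_star, n, eps=1e-4, gamma=1.0,
                 num_steps=20000, burn_in=2000, seed=0):
    """Langevin-based estimate of the local learning coefficient at w_star.

    Assumed estimator: n * beta * (mean sampled loss - loss at w_star),
    with inverse temperature beta = 1 / log(n). All defaults are
    illustrative choices, not recommended settings.
    """
    rng = np.random.default_rng(seed)
    beta = 1.0 / np.log(n)
    w = w_star.copy()
    sampled_losses = []
    for step in range(num_steps):
        # Gradient of the localized negative log-posterior:
        # n * beta * L(w) + (gamma / 2) * ||w - w_star||^2
        drift = n * beta * grad_loss(w) + gamma * (w - w_star)
        noise = rng.standard_normal(w.shape)
        w = w - 0.5 * eps * drift + np.sqrt(eps) * noise
        if step >= burn_in:
            sampled_losses.append(loss(w))
    return n * beta * (np.mean(sampled_losses) - loss(w_star))

# Toy regular case (an assumption for illustration): a quadratic loss in
# d = 4 dimensions, whose learning coefficient is d / 2 = 2.
d = 4
quadratic_loss = lambda w: 0.5 * np.sum(w ** 2)
quadratic_grad = lambda w: w

llc_hat = estimate_llc(quadratic_loss, quadratic_grad,
                       w_star=np.zeros(d), n=10_000)
print(f"estimated LLC: {llc_hat:.3f} (theoretical value for this toy: {d / 2})")
```

On this quadratic toy the estimate should land near d/2, with a small upward bias from the finite step size. That only shows the sampler behaves in an easy, regular case; it says nothing about whether the same procedure converges for deep non-linear models, where the result can depend heavily on `eps`, `gamma`, and the chain length.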