Singular learning theory led to the development of empirical LLC estimators and susceptibilities. No capabilities benefits, to my knowledge.
Computational mechanics led to the discovery of belief state geometry in transformers in a variety of cases. No capabilities benefits, to my knowledge.
The Superposition Hypothesis / Linear Representation Hypothesis led to, e.g., the development of SAEs and our current conception of features. In my view this had marginal capabilities benefits (activation steering was an interesting case study but often suffers at scale).
The vast majority of MIRI’s work in the 2010s (reflective oracles, decision theory, logical inductors) was useful for our understanding of bounded and ideal reasoners but had little to no impact on capabilities development.
I expect most classical circuit theory to be essentially irrelevant for capabilities development and somewhat useful for safety (understanding steganography, adversarially generated backdoors in models, ARC-flavored ideas).
It seems like theory, on the whole, has not really moved the needle on the development of deep learning over the last twenty-odd years. muP is an exception that ~proves the rule. I think this is a pretty general truth about the world: America’s year-on-year 2% GDP growth over the last 150 years was not caused by policies set by a politburo of growth economists.
I am curious why you put 10% likelihood on [1] and [2].