This post and comment got me reflecting, and I wanted to share a model I have of conceptual work and how it differs from empirical work. The TL;DR is that conceptual work is a bit like choosing axioms, whilst empirical work is like testing the probabilistic claims that those axioms imply.
(I had Claude help me generate the red thread and scaffolding for the shortform based on my instructions but I rewrote it from there.)
The Axiom Selection Problem
Any formal system requires foundational axioms—unprovable assumptions that can’t be justified from within the system itself. The choice between Euclidean and non-Euclidean geometry isn’t about which is “correct” but which is useful for specific purposes.
Joscha Bach frames this well when discussing consciousness: you can’t directly verify the substrate you’re running on—you can only observe its causal consequences. This creates a fundamental verification barrier.
This extends directly to AI systems. When we build models, we embed specific axiomatic frameworks that determine:
What patterns can be recognized
Which relationships are meaningful
What optimizations are prioritized
These choices constitute the frame through which the AI interprets reality, but the system cannot step outside its own frame to evaluate whether these axioms were appropriate.
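As a toy illustration of the point above (this is my own sketch, not something from the post): the choice of aggregation in a model architecture acts like an axiom about which patterns exist. A model that pools its inputs by summation assumes permutation invariance, so orderings are literally invisible to it, no matter how it is trained.

```python
import numpy as np

def sum_pool_model(x):
    # Deep-Sets-style aggregation: sum over elements, then a fixed map.
    # The summation bakes in permutation invariance as an "axiom".
    return np.tanh(np.sum(x))

def sequence_model(x):
    # Position-weighted aggregation: order matters by construction.
    weights = np.arange(1, len(x) + 1)
    return np.tanh(np.dot(weights, x))

a = np.array([1.0, 2.0, 3.0])
b = np.array([3.0, 1.0, 2.0])  # same multiset, different order

# Under the permutation-invariant frame, the two inputs are indistinguishable.
assert sum_pool_model(a) == sum_pool_model(b)

# Under the sequential frame, the difference is visible.
assert sequence_model(a) != sequence_model(b)
```

Neither model can report that its frame is wrong for the task; that judgment has to come from outside the system, which is exactly the verification barrier described above.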
The Example in the post
John Wentworth captures this in his comment on automating alignment research:
“If we’re imagining an LLM-like automated researcher, then a question I’d consider extremely important is: ‘Is this model/analysis/paper/etc missing key conceptual pieces?’ If the system says ‘yes, here’s something it’s missing’ then I can (usually) verify that. But if it says ‘nope, looks good’… then I can’t verify that the paper is in fact not missing anything.”
This is the verification paradox in action—completeness cannot be verified from within the system itself. As Wentworth notes, if you train an AI to output verifiable insights, there’s no way to verify that it isn’t “missing lots of things all the time.”
Multiple Frames vs. Single Optimization
In geometric deep learning, different inductive biases create different generative priors. In my head these play a role similar to the axiomatic statements that any logical or philosophical theory is built upon, except that they serve as the ground for probabilistic optimization instead.
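A minimal sketch of one such inductive bias (my own example, not from the geometric deep learning literature directly): a 1-D convolution encodes translation equivariance as a prior. Shifting the input shifts the output, and this holds before any data is seen; it is an "axiom" of the architecture rather than a learned fact.

```python
import numpy as np

def conv1d(x, kernel):
    # 'Valid' cross-correlation, as used inside convolutional layers.
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel)
                     for i in range(len(x) - k + 1)])

x = np.array([0.0, 1.0, 4.0, 2.0, 0.0, 0.0])
kernel = np.array([1.0, -1.0])

shifted_x = np.roll(x, 1)  # translate the signal by one step

y = conv1d(x, kernel)
y_shifted = conv1d(shifted_x, kernel)

# Translation equivariance: convolving the shifted input matches
# shifting the output (comparing interiors to ignore boundary effects).
assert np.allclose(y[:-1], y_shifted[1:])
```

A fully-connected layer would have no such guarantee; the choice between the two is exactly a choice of which relationships the model treats as meaningful by default.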
I think this is part of a larger question about models in philosophy of science. The ML community has largely adopted what physicists call the “shut up and calculate” approach—mostly optimizing within the bitter-lesson regime. This has yielded impressive results but creates blind spots similar to what has reportedly been happening in particle physics (I’m not an expert here, so I could be wrong), where experiments are run that fail to test the underlying theories.
The key issue isn’t just optimizing within a frame, but developing the capacity to move between frames—to recognize when one axiomatic system is more appropriate than another. Are we training AI systems to identify when their frame of reference is inadequate? To recognize the limitations of their axioms?
Models Are Wrong But Useful
As George Box noted, “all models are wrong, but some are useful.” The challenge isn’t finding the one true model but developing systems that can navigate between multiple imperfect models.
Our current scaling paradigm might be fundamentally limited here. When we optimize for performance within a fixed frame, we may not be developing the meta-cognitive capacity to recognize when that frame itself is inadequate.
This creates a verification bottleneck—if our AIs can’t question their own frameworks, how can we trust their judgments about their own capabilities and limitations? If science is about asking the right questions and selecting appropriate frames, this limitation becomes crucial.
The path forward isn’t abandoning verification but recognizing that we need complementary frames of reference that can mutually constrain each other. No single frame will ever be complete, but a system that can navigate between multiple frames might approach a more robust form of understanding.
I’m increasingly convinced that frame-shifting capacity may be as important for alignment as optimization within frames—and that our current approaches may not be developing this capacity sufficiently.
For an even more in-depth technical view on this within complexity science, I really enjoyed the following two articles:
“Model-free” analysis of a complex system
“Model-free” analysis of a complex system. Part II