Science, and more specifically physics, is built on first theorizing or philosophizing, coming up with a lot of potential worlds a priori, and only looking to see which one you probably fall in after the philosophizing is done.
This is a mainstream position, which is sort of, kind of true in broad strokes, but it misses very important context, leaving people confused about all the interesting nuance and preventing further progress. So I push back against it.
The important context/interesting nuances are:
There is no pure reasoning, disentangled from the reality you live in. Even if you tried to ignore all your personal experience while coming up with a theory, the reasoning ability of your brain is already the result of an optimization process inside that reality, one that encompasses generations of life experience in it.
People do not actually try to ignore their experience while coming up with the theories. They generalize and formalize this experience. Presenting it as if they first came up with general principles which were then validated by the experience for the sake of readers comprehension, even though the casual process that led to her discovery of the theory is different.
These principles are not left unchanged forever once they are formalized. We go back and retcon them when needed, based on the additional experience we accumulate. It’s not “first reason, then look”; it’s “keep looking and reasoning to the best of your abilities at the same time,” to the point where the whole distinction between looking and reasoning is artificial in the first place.
I bet a good philosopher from a different universe could come up with the concept of trees
Crucially, the other universe has to be similar enough to our universe, so that reasoning that evolved in the other universe applies to ours as well.
I think my issue with empiricism is that it does not generalize, at all.
Have you heard about unsupervised learning?
Presenting it as if they first came up with general principles which were then validated by the experience for the sake of readers comprehension, even though the casual process that led to her discovery of the theory is different.
I’m not sure what this was supposed to be. It’s a noun phrase, not a sentence.
“experience. Presenting” should be “experience, presenting”, I think.
Presumably prepend or append “is wrong.”
I think your main counterpoint to what I said is that people are doing an optimization process where they look at the data while simultaneously doing a search for a better theory. In fact, you cannot even disentangle their brain from the reality that created and runs it, so even the best attempt at theory first, observation second is doomed to fail.
I think the second, stronger sentence is mostly wrong. You do not need a universe similar enough to our universe to produce reasoning similar to ours, just one that can produce similar reasoning and has an incentive to. That incentive can be as little as, “I wonder what physics looks like in 3+1 dimensions?” just like our physicists wonder what it looks like in more or less dimensions, with different fundamental constants, with different laws of motion, with positive spacetime curvature, and so on. Or, we can just shove a bunch of data from our universe into theirs, and reward them for figuring it out (i.e. training LLMs).
As for the first, weaker sentence, yes this is true. Pretty much everyone has tight feedback loops, probably because the search space is too large to first categorize its entirety and then match the single branch you end up observing. I think the role of observation here is closer to moving attention to certain areas of the search space, rather than moving the search tree forward (see Richard Ngo’s shortform on chess). The thing is, this process is unnecessary for simple things. You probably learned to solve TicTacToe by playing a bunch of games, but you could have just solved it. I think the concept of trees is relatively simple, though of course if you want a refined concept like their protein composition or DNA sequence, yeah, that space is too big and you probably have to just go out and observe it.
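To make “you could have just solved it” concrete, here is a minimal sketch of solving TicTacToe by pure game-tree search, with no played games as input. Treat it as an illustration under assumptions of mine (Python, a plain minimax over board strings), not as anything specified in the thread.

```python
# Minimal sketch: solving TicTacToe by exhaustive minimax search,
# with zero observed games -- "you could have just solved it".
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value with `player` to move: +1 X wins, -1 O wins, 0 draw."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if '.' not in board:
        return 0
    children = [board[:i] + player + board[i + 1:]
                for i, cell in enumerate(board) if cell == '.']
    scores = [value(c, 'O' if player == 'X' else 'X') for c in children]
    return max(scores) if player == 'X' else min(scores)

if __name__ == "__main__":
    print(value("." * 9, "X"))  # 0: perfect play from the empty board is a draw
```

Running it prints 0: the game is a draw under perfect play, derived entirely by search rather than by playing a bunch of games.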
I don’t really understand your point about unsupervised learning. With unsupervised learning, you can just run a bunch of data through your model until it learns something. That’s the observation → theory pipeline and it’s astoundingly inefficient and bad at generalization. Humans could do the same with 100x fewer examples, which is the gap models need to clear to solve ARC-AGI. Humans are probably doing something closer to theory → observation.
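For concreteness, this is the sort of thing I mean by “run a bunch of data through your model until it learns something”, the observation → theory pipeline in its purest form. The sketch below assumes Python with scikit-learn and a toy synthetic dataset; the specific model and data are illustrative choices of mine, not anything from the thread.

```python
# Minimal sketch of unsupervised learning as observation -> theory:
# the model only ever sees unlabeled points and induces structure from them.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# "Observation": a pile of unlabeled data drawn from three hidden clusters.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# "Theory": the model compresses the data into three cluster centers,
# learned purely from the data it was run on.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(model.cluster_centers_)
```

Nothing here reasons in advance about what the clusters ought to be; whatever structure comes out is induced from the observations alone, which is exactly why this pipeline needs so much data and generalizes poorly.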