Actually, one of the reasons I stood by this interpretation of Popper was one of the quotes posted in one of the other threads here:
“the falsificationists or fallibilists say, roughly speaking, that what cannot (at present) in principle be overthrown by criticism is (at present) unworthy of being seriously considered; while what can in principle be so overthrown and yet resists all our critical efforts to do so may quite possibly be false, but is at any rate not unworthy of being seriously considered and perhaps even of being believed”
Which is apparently from Conjectures and Refutations, pg 309. Regardless, I don’t care about this argument overmuch, since we seem to have moved on to some other points.
[Solomonoff Induction] does not solve the problem to my satisfaction. It orders theories which make identical predictions (about all our data, but not about the unknown) and then lets you differentiate by that order. But isn’t that ordering arbitrary? It’s just not true that short and simple theories are always best; sometimes the truth is complicated.
Remember that in Bayesian epistemology, probabilities represent our state of knowledge, so as you pointed out, the simplest hypothesis that fits the data so far may not be the true one because we haven’t seen all of the data. But it is necessarily our best guess because of the conjunction rule.
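The ordering at issue can be sketched with a toy simplicity prior — a hypothetical stand-in for the Solomonoff prior, with made-up description lengths, over a two-element hypothesis set. Both hypotheses fit all observed data equally well, so the prior alone decides the ranking:

```python
from fractions import Fraction

# Hypothetical hypotheses with assumed description lengths in bits.
# Both fit the observed data perfectly (likelihood 1), so the posterior
# ordering comes entirely from the 2**-length simplicity prior.
hypotheses = [
    ("nth term is the nth prime", 20),
    ("degree-9999 polynomial through the first 10000 primes", 200000),
]

# Unnormalized prior weight 2**-bits, kept exact with Fraction.
weights = {name: Fraction(1, 2 ** bits) for name, bits in hypotheses}
total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}

best = max(posterior, key=posterior.get)
print(best)  # the shorter description dominates the posterior
```

The description lengths here are invented for illustration; real Solomonoff induction would measure the shortest program generating the data on a fixed universal machine, which is uncomputable.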
There are so many problems here that it’s hard to choose a starting point.
1) the data set you are using is biased (it is selective. all observation is selective)
2) there is no such thing as “raw data”—all your observations are interpreted. your interpretations may be mistaken.
3) what do you mean by “best guess”? one meaning is “most likely to be the final, perfect truth”. but a different meaning is “most useful now”.
4) You say “probabilities represent our state of knowledge”. However there are infinitely many theories with the same probability. Or there would be, except for your Solomonoff prior about simpler theories having higher probability. So the important part of “state of our knowledge” as represented by these probabilities consists mostly of the Solomonoff prior and nothing else, because it, and it alone, is dealing with the hard problem of epistemology (dealing with theories which make identical predictions about everything we have data for).
5) you can have infinite data and still get all non-empirical issues wrong
6) regarding the conjunction rule, there is miscommunication. this does not address the point i was trying to make. i think you have a premise like “all more complicated theories are merely conjunctions of simpler theories”. But that is to conceive of theories very differently than Popperians do, in what we see as a limited and narrow way. To begin to address these issues, let’s consider what’s better: a bald assertion, or an assertion and an explanation of why it is correct? If you want “most likely to happen to be the perfect, final truth” you are better off with only an unargued assertion (since any argument may be mistaken). But if you want to learn about the world, you are better off not relying on unargued assertions.
Remember that in Bayesian epistemology, probabilities represent our state of knowledge, so as you pointed out, the simplest hypothesis that fits the data so far may not be the true one because we haven’t seen all of the data. But it is necessarily our best guess because of the conjunction rule.
You are going to have to expand on this. I don’t see how the conjunction rule implies that simpler hypotheses are in general more probable. This is true if we have two hypotheses where one is X and the other is “X and Y” but that’s not how people generally apply this sort of thing. For example, I might have a sequence of numbers that for the first 10,000 terms has the nth term as the nth prime number. One hypothesis is that the nth term is always the nth prime number. But I could have as another hypothesis some high degree polynomial that matches the first 10,000 primes. That’s clearly more complicated. But one can’t use conjunction to argue that it is less likely.
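The polynomial hypothesis in that example can be made concrete. A minimal sketch, using 5 primes instead of 10,000 and exact rational arithmetic: the unique interpolating polynomial agrees with “nth term is the nth prime” on every observed term, yet the two hypotheses diverge on the very next one — and neither is a conjunction containing the other.

```python
from fractions import Fraction

def lagrange(points, x):
    """Evaluate the unique polynomial through `points` at `x`, exactly."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

primes = [2, 3, 5, 7, 11]
points = list(enumerate(primes, start=1))  # (n, nth prime)

# Both hypotheses agree on all observed data...
assert all(lagrange(points, n) == p for n, p in points)

# ...but diverge immediately after: the 6th prime is 13,
# while the degree-4 polynomial predicts 22.
print(lagrange(points, 6))  # -> 22
```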
Imagine that I have some set of propositions, A through Z, and I don’t know the probabilities of any of these. Now let’s say I’m using these propositions to explain some experimental result—since I would have uniform priors for A through Z, it follows that an explanation like “M did it” is more probable than “A and B did it,” which in turn is more probable than “G and P and H did it.”
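A minimal arithmetic sketch of that ordering, assuming the propositions are independent and share a common prior p (the value of p is arbitrary, as long as it is below 1):

```python
p = 0.3  # assumed common prior for each of the propositions A..Z

prob_M = p              # "M did it"
prob_A_and_B = p * p    # conjunction of two independent propositions
prob_G_P_H = p ** 3     # conjunction of three

# The conjunction rule: each extra conjunct can only lower the probability,
# so the single-proposition explanation ranks highest.
assert prob_M > prob_A_and_B > prob_G_P_H
```

Note that P(X and Y) <= P(X) holds even without independence; independence is assumed here only to get the clean p, p², p³ values.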
Yes, I agree with you there. But this is much weaker than any general form of Occam. See my example with primes. What we want to say in some form of Occam approach is much stronger than what you can get from simply using the conjunction argument.