Nope. This was a good point by Jaynes. The truth may not exist in your hypothesis space. It may be (and often is) something you haven’t conceived of.
Yes, the implicit assumption here is that the model is true.
Low likelihood of data under a hypothesis in no way implies rejection of that hypothesis.
6. Therefore the alternative hypothesis is true.
Without also calculating the likelihood under the alternative hypothesis (it may be less), this is unjustified as well.
I don’t think you understood my point. I’m avoiding claiming any inductive theory is correct—including Bayes’—and trying to show how hypothesis testing may be a way to do induction while simultaneously being agnostic about the correct theory. That Bayesian theory rejects certain steps of the hypothesis testing process is irrelevant to my point (and if you read closely, you’ll see that I acknowledge it anyway).
I think that’s a bad assumption, and if you’re trying to steelman, you should avoid relying on bad assumptions.
In any given problem the model is almost certainly false, but whether you use frequentist or Bayesian inference you have to implicitly assume that it’s (approximately) true in order to actually conduct inference. Saying “don’t assume the model is true because it isn’t” is unhelpful and a nonstarter. If you actually want to get an answer, you have to assume something even if you know it isn’t quite right.
Going from 4 to 5 looks dependent on an inductive theory to me.
Why yes it does. Did you read what I wrote about that?
Saying “don’t assume the model is true because it isn’t” is unhelpful and a nonstarter.
It starts fine for me.
Testing just the null hypothesis is the least one can do. Then one can test the alternative; that way you at least get a likelihood ratio. You can add priors or not. Then one can build in terms modeling your ignorance.
Testing just the null hypothesis is the least one can do. Then one can test the alternative; that way you at least get a likelihood ratio. You can add priors or not. Then one can build in terms modeling your ignorance.
This doesn’t address the problem that the truth isn’t in your hypothesis space (which is what I thought you were criticizing me for). If your model assumes constant variance, for example, when in truth there’s nonconstant variance, the truth is outside your hypothesis space. You’re not even considering it as a possibility. What does considering likelihood ratios of the hypotheses in your hypothesis space do to help you out here?
Reading that thread, I think jsteinart is right—if the truth is outside of your hypothesis space, you’re screwed no matter if you’re a Bayesian or a frequentist (which is a much more succinct way of putting my response to you). Setting up an “everything else” hypothesis doesn’t really help because you can’t compute a likelihood without some assumptions that, in all probability, expose you to the problem you’re trying to avoid.
Yes. It conflicted with what you subsequently wrote:
Are you happier if I say that Bayes is a “thick” inductive theory and that NHST can be viewed as induction with a “thin” theory which therefore keeps you from committing yourself to as much? (I do acknowledge that others treat NHST as a “thick” theory and that this difference seems like it should result in differences in the details of actually doing hypothesis tests.)
What does considering likelihood ratios of the hypotheses in your hypothesis space do to help you out here?
The likelihood ratio was for comparing the hypotheses under consideration, the null and the alternative. My point is that the likelihood of the alternative isn’t taken into consideration at all. Prior to anything Bayesian, hypothesis testing moved from only modeling the likelihood of the null to also modeling the likelihood of a specified alternative, and comparing the two.
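That move is easy to sketch numerically. A minimal illustration, where the normal model, the two means, and the sample are all hypothetical choices of mine (not anything from the thread): a null-only analysis computes just the first quantity below, while a Neyman–Pearson-style comparison also computes the second and takes the ratio.

```python
import math

def normal_loglik(data, mu, sigma=1.0):
    """Log-likelihood of the data under a Normal(mu, sigma) model."""
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

# Hypothetical sample; a null-only test would compute just ll_null.
data = [0.8, 1.1, 0.9, 1.3, 0.7]

ll_null = normal_loglik(data, mu=0.0)  # H0: mu = 0
ll_alt = normal_loglik(data, mu=1.0)   # H1: mu = 1, a specified alternative

# Log likelihood ratio; positive values favor the alternative.
# This is exactly the information a null-only test never computes.
log_ratio = ll_alt - ll_null
print(log_ratio)
```

The sample here is plainly more probable under the alternative, which the null-only procedure would never reveal.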
if the truth is outside of your hypothesis space, you’re screwed no matter if you’re a Bayesian or a frequentist
Therefore, you put an error placeholder of appropriate magnitude onto “it’s out of my hypothesis space” so that unreasonable results have some systematic check.
And the difference between Bayesian and NHST isn’t primarily how many assumptions you’ve committed to (which is enormous either way), but how many of those assumptions you’ve identified, and how you’ve specified them.
Going from 4 to 5 seems to me like silently changing “if A then B” to “if B then A”, which is a logical mistake that many people make.
More precisely, it is a silent change from “if NULL, then DATA with very low probability” to “if DATA, then NULL with very low probability”.
Specific example: Imagine a box containing 1 green circle, 10 red circles, and 100 red squares; you choose a random item. It is true that “if you choose a red item, it is unlikely to be a circle”. But it is not true that “if you choose a circle, it is unlikely to be red”.
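The asymmetry in the box example can be checked by direct enumeration (these are just the numbers from the example above):

```python
from fractions import Fraction

# The box: 1 green circle, 10 red circles, 100 red squares.
items = ([("green", "circle")] * 1
         + [("red", "circle")] * 10
         + [("red", "square")] * 100)

red = [i for i in items if i[0] == "red"]
circles = [i for i in items if i[1] == "circle"]

# P(circle | red): low -- a red item is probably a square.
p_circle_given_red = Fraction(sum(1 for i in red if i[1] == "circle"), len(red))

# P(red | circle): high -- a circle is probably red.
p_red_given_circle = Fraction(sum(1 for i in circles if i[0] == "red"), len(circles))

print(p_circle_given_red)   # 1/11
print(p_red_given_circle)   # 10/11
```

So “unlikely to be a circle given red” (1/11) coexists with “very likely to be red given circle” (10/11), which is exactly the inversion the comment warns against.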
Nope. This was a good point by Jaynes. The truth may not exist in your hypothesis space. It may be (and often is) something you haven’t conceived of.
If the truth doesn’t exist in your hypothesis space then Bayesian methods are just as screwed as frequentist methods. In fact, Bayesian methods can grow increasingly confident that an incorrect hypothesis is true in this case. I don’t see how this is a weakness of Matt’s argument.
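A small simulation makes that failure mode concrete. The coin biases and data below are hypothetical choices of mine: the true bias (60% heads) is deliberately left out of the two-hypothesis space, and the posterior still becomes nearly certain of one of the wrong hypotheses, because Bayes can only redistribute probability among the hypotheses it was given.

```python
import math

# Hypothesis space contains only two coin biases; the true bias (0.6)
# is deliberately *outside* it.
hypotheses = {"fair (p=0.5)": 0.5, "heavily biased (p=0.9)": 0.9}
prior = {name: 0.5 for name in hypotheses}

# 100 flips at exactly the true 60% heads rate (order is irrelevant
# for an i.i.d. Bernoulli likelihood).
flips = [1] * 60 + [0] * 40
heads = sum(flips)
tails = len(flips) - heads

# Log-posterior up to a constant: log prior + Bernoulli log-likelihood.
log_post = {}
for name, p in hypotheses.items():
    log_post[name] = (math.log(prior[name])
                      + heads * math.log(p)
                      + tails * math.log(1 - p))

# Normalize in a numerically stable way.
z = max(log_post.values())
norm = sum(math.exp(v - z) for v in log_post.values())
posterior = {name: math.exp(v - z) / norm for name, v in log_post.items()}
print(posterior)
```

The posterior ends up almost entirely on “fair”, the hypothesis closest to the truth, with confidence that keeps growing as more data arrive, even though the coin is not fair.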
The details are hazy at this point, but by assigning a realistic probability to the “Something else” hypothesis, you avoid making overconfident estimates of your other hypotheses in a multiple hypothesis testing scenario.
See Multiple Hypothesis Testing in Jaynes’s PTTLOS, starting pg. 98, and the punchline on pg. 105:
In summary, the role of our new hypothesis C was only to be held in abeyance until needed, like a fire extinguisher. In a normal testing situation, it is “dead”, playing no part in the inference because its probability remains far below that of the other hypotheses. But a dead hypothesis can be brought back to life by very unexpected data.
I think this is especially relevant to standard “null hypothesis” hypothesis testing because the likelihood of the data under the alternative hypothesis is never calculated, so you don’t even get a hint that your model might just suck, and instead conclude that the null hypothesis should be rejected.
What is the likelihood of the “something else” hypothesis? I don’t think this is really a general remedy.
Also, you can get the same thing in the hypothesis testing framework by doing two hypothesis tests, one of which is a comparison to the “something else” hypothesis and one of which is a comparison to the original null hypothesis.
Finally, while I forgot to mention this above, in most cases where hypothesis testing is applied, you actually are considering all possibilities, because you are doing something like P0 = “X ≤ 0”, P1 = “X > 0” and these really are logically the only possibilities =) [although I guess often you need to make some assumptions on the probabilistic dependencies among your samples to get good bounds].
Yes, you can say it in that framework. And you should. That’s part of the steelmanning exercise—putting in the things that are missing. If you steelman enough, you get to be a good Bayesian.
P0 = “X ≤ 0” and {All My other assumptions}
NOT(P0) = NOT(“X ≤ 0”) or NOT({All My other assumptions})
Testing just the null hypothesis is the least one can do. Then one can test the alternative; that way you at least get a likelihood ratio. You can add priors or not. Then one can build in terms modeling your ignorance.
See previous comment: http://lesswrong.com/lw/gqt/the_logic_of_the_hypothesis_test_a_steel_man/8ioc
One could keep going and going on modeling ignorance, but few even get that far, and I suspect it isn’t helpful to go further.