Even then, the reason this happens might plausibly be explained by the bookstore's changing information rather than by actual intransitivity.
It’s a Schelling point, er, joke isn’t the right word, but it’s funny because the day was supposed to be a Schelling point. And you forgot about it.
This is simultaneously hilarious and weak evidence that the holiday isn’t working as intended (though I think repeating the holiday every year will do the trick).
I notice that I'm confused: the maximum score on the Quantitative section is 800 (at that time), and Ph.D. econ programs won't even consider you if you're under a 780. The quantitative exam is actually really easy for math types. When you sign up for the GRE, you get a free CD with 2 practice exams. I took the first practice exam without studying at all and got a 760 or so on the quantitative section (within 10 pts). After studying I got an 800 on the second practice exam, and on the actual exam I got a 790. The questions were basic algebra for the most part, with a bit of calculus and basic stat at the top end and a tricky question here and there. The exam was easy. Really easy. And I was a math major at a tiny / terrible liberal arts school, nothing like MIT or any self-respecting state school. So it seems like it should be easy for anyone with a halfway decent mathematics background.
Now you’re telling me people intending to major in econ in grad school average a 706, and people intending to major in math average a 733? That’s low. Really low relative to my expectations. I would have expected a 730 in econ and maybe a 760 in math.
Possible explanations:
1) Tons of applicants who don’t want to believe that they aren’t cut out for their field create a long tail on the low side while the high side is capped at 800.
2) Master’s programs are, in general, more lenient and there are a large number of people who only intend to go to them, creating the same sort of long tail effect as above in 1).
3) There are way more low-tier graduate programs in both fields than I thought, willing to accept the average or even below-average student.
4) Weirdness in how these fields are classified (e.g., I don't see statistics anywhere; is that included in math?)
5) The quantitative section of the standard GRE actually doesn't matter if you're headed to a math or physics program (someone in those fields care to comment?). Note: the quantitative section of the standard GRE does matter in econ, but typically only as a way to make the first cut (usually at 760 or 780, depending on the school). I don't know much of the details here though.
6) Very few people actually study for the GRE like I did, i.e., buy a prep book and work through it. This depresses their scores even though they're much better quantitatively than I am.
Unsurprisingly, since these are in when-I-thought-of-them order, 1)-3) appeal to me the most, but 5) and 6) also seem plausible. I don't see why 4) would bias the scores down instead of up, so it seems unlikely a priori.
Not surprising, given my experience. Most religion majors I’ve met were relatively smart and often made fun of the more fundamentalist/evangelical types who typically were turned off by their religion classes. Religion majors seemed like philosophy-lite majors (which is consistent with the rankings).
Edit: Also, relative to Religion, econ has a bunch of poor English speakers who pull the other two categories down. (Note: the "analytical" section is/was actually a couple of very short essays.)
That seems to explain why econ majors get a premium, but it doesn't seem to explain why they don't rank higher. Or am I missing something?
I didn’t look at the data. I was commenting on your assessment of what they did, which showed that you didn’t know how the F test works. Your post made it seem as if all they did was run an F test that compared the average response of the control and treatment groups and found no difference.
Ok, yeah, translating what the researchers did into a Bayesian framework isn’t quite right either. Phil should have translated what they did into a frequentist framework—i.e. he still straw manned them. See my comment here.
Both the t-test and the F-test work by assuming that every subject has the same response function to the intervention:
response = effect + normally distributed error
where the effect is the same for every subject.
The F test / t test doesn’t quite say that. It makes statements about population averages. More specifically, if you’re comparing the mean of two groups, the t or F test says whether the average response of one group is the same as the other group. Heterogeneity just gets captured by the error term. In fact, econometricians define the error term as the difference between the true response and what their model says the mean response is (usually conditional on covariates).
The fact that the authors ignored potential heterogeneity in responses IS a problem for their analysis, but their result is still evidence against heterogeneous responses. If there really are heterogeneous responses, we should see them show up in the population average unless (see the sketch after this list):
1) The positive and negative effects cancel each other out exactly once you average across the population (this seems very unlikely).
2) The population average effect size is nonzero but very small, possibly because the effect only occurs in a small subset of the population (even if it's large when it does occur), or something similar but more complicated. In this case, a large enough sample size would still detect the effect.
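To make this concrete, here's a minimal R sketch with entirely made-up numbers (nothing from the paper under discussion): a hypothetical treatment that only helps about 30% of subjects, so the individual responses are heterogeneous, but a plain two-group t test on the means still picks up the nonzero average effect.

```r
set.seed(42)
n <- 1000  # subjects per group

# Hypothetical heterogeneous effect: +2 for roughly 30% of treated subjects, 0 for the rest
effect <- ifelse(runif(n) < 0.3, 2, 0)
control   <- rnorm(n, mean = 10, sd = 3)
treatment <- rnorm(n, mean = 10, sd = 3) + effect

# The t test only compares group means; the heterogeneity is absorbed into the error term,
# but the average effect (about 0.3 * 2 = 0.6) still shows up given enough data.
t.test(treatment, control)
```

With these numbers the test rejects easily; shrink the affected fraction or the sample size and you land in case 2) above.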
Now it might not be very strong evidence—this depends on sample size and the likely nature of the heterogeneity (or confounders, as Cyan mentions). And in general there is merit in your criticism of their conclusions. But I think you’ve unfairly characterized the methods they used.
It’s just not an argument against Phil that someone might take some of the data in the paper and do a Bayesian analysis that the authors did not do.
That's not what I'm saying. I'm saying that what the authors did do IS evidence against the hypothesis in question. Evidence against a homogeneous response is evidence against any response (it makes some response less likely).
They do, but was the paper he dealt with written within a Bayesian framework? I didn't read it, but it sounded like standard "let's test a null hypothesis" fare.
You don’t just ignore evidence because someone used a hypothesis test instead of your favorite Bayesian method. P(null | p value) != P(null)
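To spell that out with a toy example (my own made-up model, not anything from the paper): take a point null for a normal mean with known variance, a N(0, tau^2) prior on the mean under the alternative, and prior odds of 1. The data that generated the p-value also move the posterior probability of the null away from its prior value, so conditioning on the test result is not the same as ignoring it.

```r
set.seed(1)
n <- 50; sigma <- 1
y <- rnorm(n, mean = 0.5, sd = sigma)   # data generated with a nonzero mean
ybar <- mean(y)

# Frequentist: two-sided z test of H0: theta = 0
p_value <- 2 * pnorm(-abs(ybar) / (sigma / sqrt(n)))

# Bayesian: H0: theta = 0 vs H1: theta ~ N(0, tau^2), with P(H0) = P(H1) = 0.5 a priori
tau <- 1
m0 <- dnorm(ybar, 0, sigma / sqrt(n))            # marginal likelihood of ybar under H0
m1 <- dnorm(ybar, 0, sqrt(tau^2 + sigma^2 / n))  # marginal likelihood of ybar under H1
post_null <- m0 / (m0 + m1)                      # equal prior odds, so they cancel

c(p_value = p_value, P_null_given_data = post_null)
```

The posterior P(null | data) ends up well below the prior 0.5, even though no explicitly Bayesian analysis appeared in the original report.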
This is a lot like evaporative cooling of group beliefs.
My advisor, Jarad Niemi, has posted a bunch of lectures on Bayesian statistics to YouTube, most of them short and all pretty good IMHO. The lectures are made for Stat 544, a course at Iowa State University. They assume a decent familiarity with probability theory (most students in the course have seen most of chapters 1-5 of Casella and Berger in detail) and some knowledge of R.
If it is indeed a megameetup, I’d like to attend (from Ames, IA so in the 7 hour range).
EDIT: FWIW I’m also willing to carpool with anyone (nearly) passing through or (nearly) on the way.
I agree, but I’m not sure it was intended as an insult. The effect in (some) readers is similar though, so maybe I’m splitting hairs.
The best way not to do something is to do the best thing you could be doing instead in the best way.
I think so.
I agree
Sure, I’m just saying that personal usefulness shouldn’t be the only reason you upvote.
In response to:
and
I think a hard line needs to be drawn between statistics and epistemology. Statistics is merely a method of approximating epistemology—though a very useful one. The best statistical method in a given situation is the one that best approximates correct epistemology. (I’m not saying this is the only use for statistics, but I can’t seem to make sense of it otherwise)
Now suppose Bayesian epistemology is correct, i.e., let's say Cox's theorem + Solomonoff prior. The correct answer to any induction problem is to do the true Bayesian update implied by this epistemology, but that's not computable. Statistics gives us some common ways to get around this problem. Here are a few:
1) Bayesian statistics approach: restrict the class of possible models and put a reasonable prior over that class, then do the Bayesian update. This has exactly the same problem that Mencius and Cosma pointed out.
2) Frequentist statistics approach: restrict the class of possible models and come up with a consistent estimate of which model in that class is correct. This has all the problems that Bayesians constantly criticize frequentists for, but it typically allows for a much wider class of possible models in some sense (crucially, you often don’t have to assume distributional forms)
3) Something hybrid: e.g., Bayesian statistics with model checking. Empirical Bayes (where the prior is estimated from the data). Etc.
Now superficially, 1) looks the most like the true Bayesian update—you don’t look at the data twice, and you’re actually performing a Bayesian update. But you don’t get points for looking like the true Bayesian update, you get points for giving the same answer as the true Bayesian update. If you do 1), there’s always some chance that the class of models you’ve chosen is too restrictive for some reason. Theoretically you could continue to do 1) by just expanding the class of possible models and putting a prior over that class, but at some point that becomes computationally infeasible. Model checking is a computationally feasible way of approximating this process. And, a priori, I see no reason to think that some frequentist method won’t give the best computationally feasible approximation in some situation.
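Here's a rough R sketch of what I mean by model checking as a computationally feasible approximation (the data, prior, and test statistic are all my own toy choices): fit a deliberately too-restrictive normal model to skewed data in the style of 1), then use a posterior predictive check in the style of 3) to notice that the model class needs expanding.

```r
set.seed(1)
y <- rgamma(100, shape = 2, rate = 1)   # skewed data the normal model can't capture
n <- length(y); ybar <- mean(y); s2 <- var(y)

# Approach 1): restrict to y_i ~ N(mu, sigma^2) with the usual noninformative prior
# p(mu, sigma^2) proportional to 1/sigma^2, and draw from the resulting posterior
n_draws <- 4000
sigma2 <- (n - 1) * s2 / rchisq(n_draws, n - 1)
mu     <- rnorm(n_draws, ybar, sqrt(sigma2 / n))

# Approach 3): posterior predictive check on a statistic the model might miss (skewness)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3
skew_rep <- sapply(seq_len(n_draws),
                   function(i) skewness(rnorm(n, mu[i], sqrt(sigma2[i]))))
mean(skew_rep >= skewness(y))   # posterior predictive p-value; near 0, so expand the model class
```

The point isn't that this particular check is the right one; it's that the check is cheap, whereas putting a prior over "normal, or gamma, or anything else I haven't thought of yet" quickly stops being computationally feasible.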
So, basically, a “hardline Bayesian” should do model checking and sometimes even frequentist statistics. (Similarly, a “hardline frequentist” in the epistemological sense should sometimes do Bayesian statistics. And, in fact, they do this all the time in econometrics.)
See my similar comments here and here.