I’d understood the question to be “given identical scores”, not “given a 10 point average difference in favor of the blue weasel”.
My point was to ask: ‘suppose that the true shrinkage leads to an adjusted difference of 10 points between the two groups; how much of a gift do 10 extra points represent?’ Using the nominal score rather than the true score has the effect of inflating the score. Once you’ve established how large the inflation might be, it’s natural to wonder how much real-world consequence it might have, which leads into the Harvard musings.
i.e. we take a random sample of 100 men and 100 women with SAT scores between 1200 and 1400 (high but not perfect scores). Are the male scores going to average better than the female scores?
Depends on the means and standard deviations of the 2 distributions; given those, you could estimate how often the male sample average will be higher than the female sample average and vice versa.
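For what it's worth, here is a rough sketch of how one might estimate that by simulation. Everything in it is an assumption for illustration: the scores are treated as normally distributed, the function names `draw_in_band` and `prob_first_mean_higher` are made up, and the means and standard deviations are placeholder numbers, not real SAT statistics.

```python
import random

def draw_in_band(rng, mean, sd, lo=1200, hi=1400):
    """Rejection-sample one score from Normal(mean, sd) restricted to [lo, hi]."""
    while True:
        x = rng.gauss(mean, sd)
        if lo <= x <= hi:
            return x

def prob_first_mean_higher(mean_a, sd_a, mean_b, sd_b, n=100, trials=500, seed=0):
    """Estimate how often group A's sample average beats group B's.

    Each trial draws n in-band scores per group and compares the two averages.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        avg_a = sum(draw_in_band(rng, mean_a, sd_a) for _ in range(n)) / n
        avg_b = sum(draw_in_band(rng, mean_b, sd_b) for _ in range(n)) / n
        if avg_a > avg_b:
            wins += 1
    return wins / trials

if __name__ == "__main__":
    # Hypothetical parameters: group A's mean is 100 points above group B's.
    print(prob_first_mean_higher(1100, 200, 1000, 200))
```

With identical distributions the estimate should hover around one half; any gap in means or spreads pushes it away from that, because the two groups then have differently shaped slices inside the 1200 to 1400 band.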
The question should be ‘if we retest these 1200-1400 scorers, what will happen?’ The scores will probably drop as they regress to their mean due to an imperfect test. That’s the point.
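A minimal simulation sketch of that retest question, under loudly stated assumptions: each observed score is a normally distributed true score plus independent normal noise, the function name `retest_band` is invented here, and the group means, `true_sd`, and `noise_sd` are arbitrary placeholders rather than measured values.

```python
import random

def retest_band(group_mean, true_sd=180.0, noise_sd=90.0,
                lo=1200, hi=1400, population=100_000, seed=0):
    """Return (average first score, average retest score) for the people
    whose first observed score landed in [lo, hi]."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(population):
        true_score = rng.gauss(group_mean, true_sd)
        test1 = true_score + rng.gauss(0, noise_sd)  # imperfect first test
        if lo <= test1 <= hi:
            # Same true score, fresh measurement error on the retest.
            test2 = true_score + rng.gauss(0, noise_sd)
            first.append(test1)
            second.append(test2)
    return sum(first) / len(first), sum(second) / len(second)

if __name__ == "__main__":
    # Two hypothetical groups whose means both sit below the selection band.
    for mean in (1000, 1100):
        t1, t2 = retest_band(mean)
        print(mean, round(t1), round(t2))
```

Under these assumptions both groups drop on the retest, and the group with the lower mean drops further, since each group regresses toward its own mean.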
Ahhh, that makes the statistics click in my brain, thanks :)
Do you know if there is much data out there on real-world gender differences vis-a-vis regression to the mean on IQ / SAT / etc. tests? i.e. is this based purely on statistical reasoning, or is it borne out in empirical observations?
I haven’t seen any, offhand. Maybe the testing company provides info about retests, but then you’re going to have different issues: anyone who takes the second test may be doing so because they had a bad day (giving you regression to a mean from the other direction) and may’ve boned up on test prep since, and there’s the additional issue of test-retest effect—now that they know what the test is like, they will be less anxious and will know what to do, and test-takers in general may score better. (Since I’m looking at that right now, my DNB meta-analysis offers a case in point: in many of the experiments, the controls have slightly higher post-test IQ scores. Just the test-retest effect.)