# The irrelevance of test scores is greatly exaggerated

Here are some claims about how grades (GPA) and test scores (ACT) predict success in college.

• def printTheNews(science, ideology):
      if science.getTheme() not in ideology.keys():
          return
      print(science.getTheme(), "says", ideology[science.getTheme()])

• Clearly the press does not care about code quality, because that’s not Pythonic :(

The pythonic version is science.theme - you don’t need a getter

• I dunno, the press will swallow anything, and then it goes through these cycles of lethargy...

• I gotta say, I never get tired of epistemic walkthroughs of peer-reviewed papers. Upvote for you!

• A more general observation that I’m sure has been stated many times but clicked for me while reading this: Once you condition on the output of a prediction process, correlations are residuals. Positive/​negative/​zero coefficients then map not to good/​bad/​irrelevant but to underrated/​overrated/​valued accurately.

(“Which college a student attends” is the output of a prediction process insofar as students attend the most selective college that accepts them and colleges differ only in their admission cutoffs on a common scoring function, I think.)

• Very well stated. I would be interested in a link to something that describes that principle, the outcome of the prediction process.

• Here’s an argument for why the study’s conclusions are unsupported.

-----

Suppose that there are lots of things that go into predicting what makes a student successful. There’s ACT score, and GPA, and leadership, and race, and socioeconomic status, and countless other things.

Now, suppose colleges have tried to figure out the weightings for each of those factors, and shared their results with each other. They all compute “success scores” for each student.

Harvard takes the top 1000 applicants by score. MIT takes the next 1000. Princeton takes the third 1000. And so on.

So, what happens when you run a regression to predict success from ACT/​GPA/​etc, while controlling for school?

Well, if the formula is correct, nothing is significant!

Consider Princeton. All its success scores are, say, between +2.04 (Z-score) and +2.02, because it takes a specific thin slice of the population. That means that all the students are roughly equal. So if you find a student with a higher ACT score, he’s probably got a lower GPA. Because, if he was that high in both, he’d be higher than +2.04 overall and wind up at Harvard instead of Princeton.

In other words, NOTHING correlates to success, controlling for school, if colleges are good enough at predicting who will succeed.

Sure, there’s a small amount of slack, between +2.02 and +2.04, but it’s nowhere near enough to produce statistically significant evidence that any factor is important. Almost 100% of the variance is between schools, not within schools.

So that leaves noise. Any coefficients you find that are non-zero are probably just random artifacts.

Or … they are systematic errors in how schools evaluate students.

In this particular study, they found that controlling for school, GPA was important to success but ACT score was not.

Well, all that means is that colleges are not weighting GPA highly enough. It does NOT mean that GPA is more important than ACT score, or any other factor—only that GPA is more important *after you account for the college’s choice in whom to admit*. It could be that the colleges are giving GPA/​ACT a 1:15 ratio, and it should be only 1:10 instead. In other words, ACT could still be hugely more important than GPA, but the schools are making it a little TOO huge.

Even if everything in the study is correct, I would argue they misunderstood what they were measuring, and what the results mean. They only mean colleges are underestimating GPA relative to ACT, not that GPA is more important than ACT.
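The sorting argument above can be checked with a small simulation. This is only a sketch under made-up assumptions, not the study’s model: three independent z-scored traits (ACT, GPA, and one factor the regression never sees), hypothetical true weights of 1.0, 0.5 and 1.0, and colleges that fill up in strict rank order of their own score. The function name `within_school_coefs` and all of its parameters are illustrative.

```python
import random


def within_school_coefs(college_weights, true_weights=(1.0, 0.5, 1.0),
                        n_students=50_000, school_size=1_000, seed=0):
    """Regress success on ACT and GPA with school fixed effects.

    All weights are hypothetical, ordered (ACT, GPA, unobserved factor).
    Returns the pair (ACT coefficient, GPA coefficient).
    """
    rng = random.Random(seed)
    students = []
    for _ in range(n_students):
        traits = [rng.gauss(0, 1) for _ in range(3)]
        # True outcome: weighted traits plus luck that nobody can predict.
        success = sum(w * t for w, t in zip(true_weights, traits)) + rng.gauss(0, 1)
        # The score the colleges rank applicants by.
        score = sum(w * t for w, t in zip(college_weights, traits))
        students.append((score, traits[0], traits[1], success))

    # Best-scoring students fill the first school, the next block the second, etc.
    students.sort(key=lambda s: s[0], reverse=True)

    # Accumulate sums of squares and cross products of deviations from each
    # school's mean, which is what "controlling for school" amounts to here.
    s_aa = s_ag = s_gg = s_ay = s_gy = 0.0
    for start in range(0, n_students, school_size):
        school = students[start:start + school_size]
        n = len(school)
        mean_act = sum(s[1] for s in school) / n
        mean_gpa = sum(s[2] for s in school) / n
        mean_suc = sum(s[3] for s in school) / n
        for _, act, gpa, suc in school:
            d_act, d_gpa, d_suc = act - mean_act, gpa - mean_gpa, suc - mean_suc
            s_aa += d_act * d_act
            s_ag += d_act * d_gpa
            s_gg += d_gpa * d_gpa
            s_ay += d_act * d_suc
            s_gy += d_gpa * d_suc

    # Solve the 2x2 normal equations for the two slopes.
    det = s_aa * s_gg - s_ag ** 2
    b_act = (s_ay * s_gg - s_ag * s_gy) / det
    b_gpa = (s_aa * s_gy - s_ag * s_ay) / det
    return b_act, b_gpa


correct = within_school_coefs(college_weights=(1.0, 0.5, 1.0))
gpa_underweighted = within_school_coefs(college_weights=(1.0, 0.25, 1.0))
print("colleges use true weights: ACT %.3f, GPA %.3f" % correct)
print("colleges underweight GPA:  ACT %.3f, GPA %.3f" % gpa_underweighted)
```

If the reasoning holds, both coefficients should land near zero when colleges use the true weights. When colleges underweight GPA, only the GPA coefficient should move away from zero, toward roughly the 0.25 of weight the colleges left out, even though ACT is twice as important as GPA in the simulated truth.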

-----

Here’s an analogy:

A store will only let you in if you have exactly $1000 worth of large bills in your wallet. An academic study measures how much stuff you get based on all the money in your wallet, including small bills. Since everyone has exactly $1000 in large bills, the regression can’t deal with those, and it finds that 100% of the differences in success come from small bills.

That doesn’t mean that large bills don’t matter! It means that large bills don’t matter given that you got admission to the store. Large bills DO matter, because otherwise you wouldn’t have gotten in!

Similarly, this study’s results don’t mean that ACT doesn’t matter. They mean that ACT doesn’t matter given that you got admission to the college. If college admission criteria include ACT, then ACT does matter, because otherwise you wouldn’t have gotten in!

• I realized I forgot to provide evidence from the paper that the range of ACT within colleges is smaller than the range of GPA.

From p.207 of the paper:

“Thus, ACT scores are related to college graduation, in part, because students with higher scores are more likely to attend the kinds of colleges where students are more likely to graduate...”

(I think they obviously have this backwards, for the most part. It seems more likely to me that those “kinds of colleges” have higher graduation rates because they are the ones that choose students with higher ACT scores.)

From p. 206:

“Many schools do not have students with very high ACT scores, and a number of schools do not have students with very low ACT scores [which explains why some colleges do not have students from the full ACT range, even though they do have students from the full GPA range].”

In other words: students DO sort themselves into schools based on ACT score more than they do by GPA.

• Correction to above: the quote from p. 206 refers to high schools, not colleges.

For colleges, I found a page here that lists 25th and 75th ACT percentiles. Some pairs of schools have no overlap at all; for instance, Ohio State’s middle interval is (27, 31), while Vanderbilt is (32, 35). The average for college enrolees, per this study, was 20.1, with an SD of 4.33. So Vanderbilt’s 25th percentile is almost +3 SD.

For GPA … the 25th percentile for Vanderbilt is 3.75. The mean in this study was 2.72, with an SD of 0.65. So the 25th percentile for GPA was only around +1.6 SD.

For ACT at Vanderbilt, the 75th percentile is 0.92 SD higher than the 25th. If the same were true for GPA, the 75th percentile would have to be 4.34, which is clearly impossible, since the upper limit is 4.00.

So that supports the idea that for a given school, ACT has a narrower range than GPA.
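For anyone who wants to redo the arithmetic, here is a minimal sketch using only the figures quoted above (the variable names are mine). With the (32, 35) interval as quoted, the ACT gap comes out to about 0.69 SD rather than 0.92 (0.92 would correspond to a 4-point gap), but the implied 75th-percentile GPA is still about 4.20, above the 4.00 ceiling, so the conclusion does not change.

```python
# Figures quoted in the comment above; nothing here is new data.
ACT_MEAN, ACT_SD = 20.1, 4.33
GPA_MEAN, GPA_SD = 2.72, 0.65
VANDY_ACT_25, VANDY_ACT_75 = 32, 35
VANDY_GPA_25 = 3.75
GPA_MAX = 4.00


def z_score(value, mean, sd):
    """Distance from the mean in units of standard deviation."""
    return (value - mean) / sd


act_25_z = z_score(VANDY_ACT_25, ACT_MEAN, ACT_SD)
gpa_25_z = z_score(VANDY_GPA_25, GPA_MEAN, GPA_SD)

# Width of Vanderbilt's middle ACT interval, in ACT standard deviations.
act_gap_sd = (VANDY_ACT_75 - VANDY_ACT_25) / ACT_SD

# The 75th-percentile GPA that would give GPA the same width in SD units.
implied_gpa_75 = VANDY_GPA_25 + act_gap_sd * GPA_SD

print(f"ACT 25th percentile: {act_25_z:+.2f} SD")
print(f"GPA 25th percentile: {gpa_25_z:+.2f} SD")
print(f"ACT 25th-to-75th gap: {act_gap_sd:.2f} SD")
print(f"Implied GPA 75th percentile: {implied_gpa_75:.2f} (max {GPA_MAX:.2f})")
```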

• Here, there’s minimal dependence on ACT, but a negative dependence on ACT², meaning that extreme ACT scores (high or low) both lead to lower likely-to-graduate scores.

Does that seem counterintuitive to you? Remember, we are taking a student who is already enrolled in a particular known college and predicting how likely they are to graduate from that college.

Sounds like a classic example of Simpson’s paradox, no?

• Where are the footnotes?

• I’m not very familiar with academia, but have you considered sending this to the authors of the paper to a) see if there are any mistakes you made and b) help them avoid similar errors in the future?
But I acknowledge that this could lead to a long email exchange that you may not want.

• I’ve politely contacted them several times via several different channels, just asking for clarifications and for what the “missing coefficients” are in the last model. Total stonewall: they won’t even acknowledge my contacts. Some people more connected to the education community apparently did the same as a result of my post, with the same result.