But neither of those is a particularly compelling reason for disagreement. Can anyone more familiar with the psychological/statistical territory shed some light?
Shalizi’s most basic point — that factor analysis will generate a general factor for any bunch of sufficiently strongly correlated variables — is correct.
Here’s a demo. The statistical analysis package R comes with some built-in datasets to play with. I skimmed through the list and picked out six monthly datasets (72 data points in each):
atmospheric CO2 concentrations, 1959-1964
female UK lung deaths, 1974-1979
international airline passengers, 1949-1954
sunspot counts, 1749-1754
car drivers killed & seriously injured in Great Britain, 1969-1974
It’s pretty unlikely that there’s a single causal general factor that explains most of the variation in all six of these time series, especially as they’re from mostly non-overlapping time intervals. They aren’t even that well correlated with each other: the mean correlation between different time series is −0.10, with a standard deviation of 0.34. And yet, when I ask R’s canned factor analysis routine to calculate a general factor for these six time series, that general factor explains one third of their variance!
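For anyone who wants to poke at the same effect without R, here is a minimal sketch in plain Python. It is not the analysis above: the six series are synthetic (each gets its own randomly drawn trend, seasonal cycle, and noise, all of which are my assumptions), and instead of a maximum-likelihood factor analysis such as R's `factanal` it takes the first principal component of the correlation matrix as a rough stand-in for a general factor. The names `make_series`, `correlation`, and `top_eigenvalue` are made up for this illustration.

```python
import math
import random


def make_series(rng, n=72):
    """One synthetic 'monthly' series with its own trend, seasonal cycle, and noise."""
    slope = rng.uniform(-1.0, 1.0)           # assumed trend per month
    amplitude = rng.uniform(0.0, 20.0)       # assumed seasonal swing
    phase = rng.uniform(0.0, 2.0 * math.pi)  # assumed seasonal offset
    return [
        slope * t
        + amplitude * math.sin(2.0 * math.pi * t / 12.0 + phase)
        + rng.gauss(0.0, 5.0)
        for t in range(n)
    ]


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def top_eigenvalue(matrix, rng, iterations=1000):
    """Largest eigenvalue of a symmetric positive semi-definite matrix, by power iteration."""
    p = len(matrix)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(v[i] * mv[i] for i in range(p))  # Rayleigh quotient for unit-length v


rng = random.Random(0)  # fixed seed so reruns give the same numbers
series = [make_series(rng) for _ in range(6)]
p = len(series)
corr = [[correlation(a, b) for b in series] for a in series]

off_diagonal = [corr[i][j] for i in range(p) for j in range(i + 1, p)]
mean_corr = sum(off_diagonal) / len(off_diagonal)

# The eigenvalues of a p-by-p correlation matrix sum to p, so the largest
# eigenvalue divided by p is the share of total variance on the first component.
share = top_eigenvalue(corr, rng) / p

print(f"mean pairwise correlation: {mean_corr:.2f}")
print(f"variance share of first component: {share:.2f}")
```

One thing the sketch makes visible: the largest eigenvalue of a correlation matrix is never below 1, so the first component of six variables always carries at least one sixth of the variance, and any nonzero correlations (positive or negative) push that share higher. Changing the seed or the assumed ranges shows how much it moves for series that are unrelated by construction.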
However, Shalizi’s blog post covers a lot more ground than just this basic point, and it’s difficult for me to work out exactly what he’s trying to say, which in turn makes it difficult to say how correct he is overall. What does Shalizi mean specifically by calling g a myth? Does he think it is very unlikely to exist, or just that factor analysis is not good evidence for it? Who does he think is in error about its nature? I can think of one researcher in particular who stands out as just not getting it, but beyond that I’m just not sure.
What ought we discuss if not evidence?