Poking around on Cosma Shalizi’s website, I found this long, somewhat technical argument for why the general intelligence factor, g, doesn’t exist.
The main thrust is that g is an artifact of hierarchical factor analysis: whenever you have groups of variables with positive correlations between them, a general factor will always appear that explains a fair amount of the variance, whether it actually exists or not.
I’m not convinced, mainly because it strikes me as unlikely that an error of this type would persist for so long, and that even his conception of intelligence as a large number of separate abilities would need some sort of high level selection and sequencing function. But neither of those are particularly compelling reasons for disagreement—can anyone more familiar with the psychological/statistical territory shed some light?
I pointed this out to my buddy, who's a psychology doctoral student; his reply is below:
I don’t know enough about g to say whether the people talking about it are falling prey to the general correlation between tests, but this phenomenon is pretty well-known to social science researchers.
I do know enough about CFA and EFA to tell you that this guy has an unreasonable boner for CFA. CFA doesn’t test against truth, it tests against other models. Which means it only tells you whether the model you’re looking at fits better than a comparator model. If that’s a null model, that’s not a particularly great line of analysis.
He pretty blatantly misrepresents this. And his criticisms of things like Big Five are pretty wild. Big Five, by its very nature, fits the correlations extremely well. The largest criticism of Big Five is that it’s not theory-driven, but data-driven!
But my biggest beef has got to be him arguing that EFA is not a technique for determining causality. No shit. That is the very nature of EFA—it’s a technique for loading factors (which have no inherent “truth” to them by loading alone, and are highly subject to reification) in order to maximize variance explained. He doesn’t need to argue this point for a million words. It’s definitional.
So regardless of whether g exists or not, which I’m not really qualified to speak on, this guy is kind of a hugely misleading writer. MINUS FIVE SCIENCE POINTS TO HIM.
I think this is one of the few cases where Shalizi is wrong. (Not an easy thing to say, as I’m a big fan of his.)
In the second part of the article he generates synthetic “test scores” of people who have three thousand independent abilities—“facets of intelligence” that apply to different problems—and demonstrates that standard factor analysis still detects a strong single g-factor explaining most of the variance between people. From that he concludes that g is a “statistical artefact” and lacks “reality”. This is exactly like saying the total weight of the rockpile “lacks reality” because the weights of individual rocks are independent variables.
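That simulation is easy to reproduce in miniature. The sketch below is my own reconstruction, not Shalizi's actual code, and all the parameters (population size, number of tests, sampling fraction) are illustrative: each simulated person gets thousands of mutually independent abilities, each test sums a random half of them, and we then check how much variance the first factor of the test-score correlations soaks up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_abilities, n_tests, per_test = 2000, 3000, 10, 1500

# Every "ability" is independent of every other: no underlying g by construction.
abilities = rng.standard_normal((n_people, n_abilities))

# Each test score sums a random half of the abilities, so any two tests
# share (in expectation) about half of the abilities either one uses.
scores = np.empty((n_people, n_tests))
for t in range(n_tests):
    sampled = rng.choice(n_abilities, size=per_test, replace=False)
    scores[:, t] = abilities[:, sampled].sum(axis=1)

# Share of variance captured by the first factor of the test correlations.
corr = np.corrcoef(scores, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)  # ascending order
share = eigvals[-1] / eigvals.sum()
print(f"first factor explains {share:.0%} of test-score variance")
```

With these settings the first factor typically explains around half the variance, despite the three thousand abilities being uncorrelated by construction.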
As for the reason why he is wrong, it’s pretty clear: Shalizi is a Marxist (fo’ real) and can’t give an inch to those pesky racists. A sad sight, that.
Indeed. A while ago, I got intensely interested in these controversies over intelligence research, and after reading a whole pile of books and research papers, I got the impression that there is some awfully bad statistics being pushed by pretty much every side in the controversy, so at the end I was left skeptical towards all the major opposing positions (though to varying degrees). If there existed a book written by someone as smart and knowledgeable as Shalizi that would present a systematic, thorough, and unbiased analysis of this whole mess, I would gladly pay $1,000 for it. Alas, Shalizi has definitely let his ideology get the better of him this time.
He also wrote an interesting long post on the heritability of IQ, which is better, but still clearly slanted ideologically. I recommend reading it nevertheless, but to get a more accurate view of the whole issue, I recommend reading the excellent Making Sense of Heritability by Neven Sesardić alongside it.
If there existed a book written by someone as smart and knowledgeable as Shalizi that would present a systematic, thorough, and unbiased analysis of this whole mess, I would gladly pay $1,000 for it.
There is no such book (yet), but there are two books that cover the most controversial part of the mess that I’d recommend: Race Differences in Intelligence (1975) and Race, IQ and Jensen (1980). They are both systematic, thorough, and about as unbiased as one can reasonably expect on the subject of race & IQ. On the down side, they don’t really cover other aspects of the IQ controversies, and they’re three decades out of date. (That said, I personally think that few studies published since 1980 bear strongly on the race & IQ issue, so the books’ age doesn’t matter that much.)
Yes, among the books on the race-IQ controversy that I’ve seen, I agree that these are the closest thing to an unbiased source. However, I disagree that nothing very significant has happened in the field since their publication—although unfortunately, taken together, these new developments have led to an even greater overall confusion. I have in mind particularly the discovery of the Flynn effect and the Minnesota adoption study, which have made it even more difficult to argue coherently either for a hereditarian or an environmentalist theory the way it was done in the seventies.
Also, even these books fail to present a satisfactory treatment of some basic questions where a competent statistician should be able to clarify things fully, but horrible confusion has nevertheless persisted for decades. Here I refer primarily to the use of the regression to the mean as a basis for hereditarian arguments. From what I’ve seen, Jensen is still using such arguments as a major source of support for his positions, constantly replying to the existing superficial critiques with superficial counter-arguments, and I’ve never seen anyone giving this issue the full attention it deserves.
However, I disagree that nothing very significant has happened in the field since their publication
Me too! I just don’t think there’s been much new data brought to the table. I agree with you in counting Flynn’s 1987 paper and the Minnesota followup report, and I’d add Moore’s 1986 study of adopted black children, the recent meta-analyses by Jelte Wicherts and colleagues on the mean IQs of sub-Saharan Africans, Dickens & Flynn’s 2006 paper on black Americans’ IQs converging on whites’ (and at a push, Rushton & Jensen’s reply along with Dickens & Flynn’s rejoinder), Fryer & Levitt’s 2007 paper about IQ gaps in young children, and Fagan & Holland’s papers (2002, 2007, 2009) on developing tests where minorities score equally to whites. I guess Richard Lynn et al.’s papers on the mean IQ of East Asians count as well, although it’s really the black-white comparison that gets people’s hackles up.
Having written out a list, it does look longer than I expected...although it’s not much for 30-35 years of controversy!
Also, even these books fail to present a satisfactory treatment of some basic questions where a competent statistician should be able to clarify things fully, but horrible confusion has nevertheless persisted for decades. Here I refer primarily to the use of the regression to the mean as a basis for hereditarian arguments.
Amen. The regression argument should’ve been dropped by 1980 at the latest. In fairness to Flynn, his book does namecheck that argument and explain why it’s wrong, albeit only briefly.
The regression argument should’ve been dropped by 1980 at the latest. In fairness to Flynn, his book does namecheck that argument and explain why it’s wrong, albeit only briefly.
If I remember correctly, Loehlin’s book also mentions it briefly. However, it seems to me that the situation is actually more complex.
Jensen’s arguments, in the forms in which he has been stating them for decades, are clearly inadequate. Some very good responses were published 30+ years ago by Mackenzie and Furby. Yet for some bizarre reason, prominent critics of Jensen have typically ignored these excellent references and instead produced their own much less thorough and clear counterarguments.
Nevertheless, I’m not sure the argument should end here. Certainly, if we observe a subpopulation S in which the values of a trait follow a normal distribution with a mean M(S) lower than that of the whole population, then in pairs of individuals from S whose scores correlate (with a correlation independent of rank and smaller than one), the partners of low-scoring individuals will be expected to regress towards M(S). That’s a mathematical tautology, and nothing can be inferred from it about what the causes of the individual and group differences might be; the above cited papers explain this fact very well.
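The tautology is easy to verify numerically. In this sketch (my own illustration; all the numbers are arbitrary), pairs of relatives within a subpopulation S both have mean M(S) = 85 and correlate at 0.5. Selecting pairs where one member scored far below M(S), the partners' average lands between that score and M(S), with no causal assumptions anywhere in the setup:

```python
import numpy as np

rng = np.random.default_rng(0)
mean_S, sd, r, n = 85.0, 15.0, 0.5, 100_000

# Correlated pairs (say, siblings) drawn from subpopulation S;
# both members have the same marginal distribution, correlation r.
x = rng.standard_normal(n)
y = r * x + np.sqrt(1 - r**2) * rng.standard_normal(n)
first, second = mean_S + sd * x, mean_S + sd * y

# Condition on the first member scoring well below M(S) ...
low = first < 70
# ... the partners regress toward M(S): their mean sits above the
# selected members' mean but below 85, purely as a matter of statistics.
print(f"selected members average {first[low].mean():.1f}, "
      f"their partners average {second[low].mean():.1f}")
```

Nothing in the code says anything about genes or environments, which is exactly the point: the regression pattern follows from the distributional assumptions alone.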
However, the question that I’m not sure about is: what can we conclude from the fact that the existing statistical distributions and correlations are such that they satisfy these mathematical conditions? Is this really a trivial consequence of the norming of tests that’s engineered so as to give their scores a normal distribution over the whole population? I’d like to see someone really statistics-savvy scrutinize the issue without starting from the assumption that both the total population distribution and the subpopulation distribution are normal and that the correlation coefficients between relatives are independent of their rank in the distribution.
Well, if you’ll excuse the ugly metaphor, in this area even the positive questions are giant cans of worms lined on top of third rails, so I really have no desire to get into public discussions of normative policy issues.
Can you point to specific parts of that post which are in error owing to ideologically motivated thinking?
A piece of writing biased for ideological reasons doesn’t even have to have any specific parts that can be shown to be in error per se. Enormous edifices of propaganda can be constructed—and have been constructed many times in history—based solely on the selection and arrangement of the presented facts and claims, which can all be technically true by themselves.
In areas that arouse strong ideological passions, all sorts of surveys and other works aimed at broad audiences can be expected to suffer from this sort of bias. For a non-expert reader, this problem can be recognized and overcome only by reading works written by people espousing different perspectives. That’s why I recommend that people should read Shalizi’s post on heritability, but also at least one more work addressing the same issues written by another very smart author who doesn’t share the same ideological position. (And Sesardić’s book is, to my knowledge, the best such reference about this topic.)
Instead of getting into a convoluted discussion of concrete points in Shalizi’s article, I’ll just conclude with the following remark. You can read Shalizi’s article, conclude that it’s the definitive word on the subject, and accept his view of the matter. But you can also read more widely on the topic, and see that his presentation is far from unbiased, even if you ultimately conclude that his basic points are correct. The relevant literature is easily accessible if you just have internet and library access.
The weight of the rock pile is just what we call the sum of the weights of the rocks. It’s just a definition; but the idea of general intelligence is more than a definition. If there were a real, biological thing called g, we would expect all kinds of abilities to be correlated. Intelligence would make you better at math and music and English. We would expect basically all cognitive abilities to be affected by g, because g is real—it represents something like dendrite density, some actual intelligence-granting property.
People hypothesized that g is real because results of all kinds of cognitive tests are correlated. But what Shalizi showed is that you can generate the same correlations if you let test scores depend on three thousand uncorrelated abilities. You can get the same results as the IQ advocates even when absolutely no single factor determines different abilities.
Sure, your old g will correlate with multiple abilities—hell, you could let g = “test score” and that would correlate with all the abilities—but that would be meaningless.
If size and location determine the price of a house, you don’t declare that there is some factor that causes both large size and desirable location!
But what Shalizi showed is that you can generate the same correlations if you let test scores depend on three thousand uncorrelated abilities. You can get the same results as the IQ advocates even when absolutely no single factor determines different abilities.
Just to be clear, this is not an original idea by Shalizi, but the well known “sampling theory” of general intelligence first proposed by Godfrey Thomson almost a century ago. Shalizi states this very clearly in the post, and credits Thomson with the idea. However, for whatever reason, he fails to mention the very extensive discussions of this theory in the existing literature, and writes as if Thomson’s theory had been ignored ever since, which definitely doesn’t represent the actual situation accurately.
In a recent paper by van der Maas et al., which presents an extremely interesting novel theory of correlations that give rise to g (and which Shalizi links to at one point), the authors write:
Thorndike (1927) and Thomson (1951) proposed one such alternative mechanism, namely, sampling. In this sampling theory, carrying out cognitive tasks requires the use of many lower order uncorrelated modules or neural processes (so-called bonds). They hypothesized that the samples of modules or bonds used for different cognitive tests partly overlap, causing a positive correlation between the test scores. In this view, the positive manifold is due to a measurement problem in the sense that it is very difficult to obtain independent measures of the lower order processes. Jensen (1998) and Eysenck (1987) identified three problems with this sampling theory. First, whereas some complex mental tests, as predicted by sampling theory, highly load on the g factor, some very narrowly defined tests also display high g loadings. Second, some seemingly completely unrelated tests, such as visual and memory scan tasks, are consistently highly correlated, whereas related tests, such as forward and backward digit span, are only modestly correlated. Third, in some cases brain damage leads to very specific impairments, whereas sampling theory predicts general impairments. These three facts are difficult to explain with sampling theory, which as a consequence has not gained much acceptance. Thus, the g explanation remains very dominant in the current literature (see Jensen, 1998, p. 107).
Note that I take no position here about whether these criticisms of the sampling theory are correct or not. However, I think this quote clearly demonstrates that an attempt to write off g by merely invoking the sampling theory is not a constructive contribution to the discussion.
I would also add that if someone managed to construct multiple tests of mental ability that would sample disjoint sets of Thomsonesque underlying abilities and thus fail to give rise to g, it would be considered a tremendous breakthrough. Yet, despite the strong incentive to achieve this, nobody who has tried so far has succeeded. This evidence is far from conclusive, but far from insignificant either.
I think Shalizi isn’t too far off the mark in writing “as if Thomson’s theory had been ignored”. Although a few psychologists & psychometricians have acknowledged Thomson’s sampling model, in everyday practice it’s generally ignored. There are far more papers out there that fit g-oriented factor models as a matter of course than those that try to fit a Thomson-style model. Admittedly, there is a very good reason for that — Thomson-style models would be massively underspecified on the datasets available to psychologists, so it’s not practical to fit them — but that doesn’t change the fact that a g-based model is the go-to choice for the everyday psychologist.
There’s an interesting analogy here to Shalizi’s post about IQ’s heritability, now I think about it. Shalizi writes it as if psychologists and behaviour geneticists don’t care about gene-environment correlation, gene-environment interaction, nonlinearities, there not really being such a thing as “the” heritability of IQ, and so on. One could object that this isn’t true — there are plenty of papers out there concerned with these complexities — but on the other hand, although the textbooks pay lip service to them, researchers often resort to fitting models that ignore these speedbumps. The reason for this is the same as in the case of Thomson’s model: given the data available to scientists, models that accounted for these effects would usually be ruinously underspecified. So they make do.
However, it seems to me that the fatal problem of the sampling theory is that nobody has ever managed to figure out a way to sample disjoint sets of these hypothetical uncorrelated modules. If all practically useful mental abilities and all the tests successfully predicting them always sample some particular subset of these modules, then we might as well look at that subset as a unified entity that represents the causal factor behind g, since its elements operate together as a group in all relevant cases.
Or is there some additional issue here that I’m not taking into account?
I can’t immediately think of any additional issue. It’s more that I don’t see the lack of well-known disjoint sets of uncorrelated cognitive modules as a fatal problem for Thomson’s theory, merely weak disconfirming evidence. This is because I assign a relatively low probability to psychologists detecting tests that sample disjoint sets of modules even if they exist.
For example, I can think of a situation where psychologists & psychometricians have missed a similar phenomenon: negatively correlated cognitive tests. I know of a couple of examples which I found only because the mathematician Warren D. Smith describes them in his paper “Mathematical definition of ‘intelligence’ (and consequences)”. The paper’s about the general goal of coming up with universal definitions of and ways to measure intelligence, but in the middle of it is a polemical/sceptical summary of research into g & IQ.
Smith went through a correlation matrix for 57 tests given to 240 people, published by Thurstone in 1938, and saw that the 3 most negative of the 1596 intercorrelations were between these pairs of tests:
“100-word vocabulary test // Recognize pictures of hand as Right/Left” (correlation = −0.22)
“Find lots of synonyms of a given word // Decide whether 2 pictures of a national flag are relatively mirrored or not” (correlation = −0.16)
“Describe somebody in writing: score=# words used // figure recognition test: decide which numbers in a list of drawings of abstract figures are ones you saw in a previously shown list” (correlation = −0.12)
In Smith’s words: “This seems too much to be a coincidence!” Smith then went to the 60-item correlation matrix for 710 schoolchildren published by Thurstone & Thurstone in 1941 and did the same, discovering that
the three most negative [correlations], with values −0.161, −0.152, and −0.138 respectively, are the pairwise correlations of the performance on the “scattered Xs” test (circle the Xs in a random scattering of letters) with these three tests: (a) Sentence completion … (b) Reading comprehension II … (c) Reading comprehension I … Again, it is difficult to believe this also is a coincidence!
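Smith's procedure itself is mechanical: take the published correlation matrix and sort its upper-triangle entries. A minimal sketch of that scan (using random data as a stand-in, since Thurstone's actual matrix would have to be keyed in by hand):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 240 people, 8 tests. In practice you'd load the
# published correlation matrix directly instead of computing one.
scores = rng.standard_normal((240, 8))
corr = np.corrcoef(scores, rowvar=False)

# List the most negative of the k*(k-1)/2 distinct intercorrelations.
i_idx, j_idx = np.triu_indices_from(corr, k=1)
order = np.argsort(corr[i_idx, j_idx])
for idx in order[:3]:
    i, j = i_idx[idx], j_idx[idx]
    print(f"tests {i} & {j}: r = {corr[i, j]:+.3f}")
```

The interesting part of Smith's observation isn't the scan, of course, but that the most negative pairs in the real data involved conceptually similar tests rather than random noise.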
The existence of two pairs of negatively correlated cognitive skills leads me to increase my prior for the existence of uncorrelated cognitive skills.
Also, the way psychologists often analyze test batteries makes it harder to spot disjoint sets of uncorrelated modules. Suppose we have a 3-test battery, where test 1 samples uncorrelated modules A, B, C, D & E, test 2 samples F, G, H, I & J, and test 3 samples C, D, E, F & G. If we administer the battery to a few thousand people and extract a g from the results, as is standard practice, then by construction the resulting g is going to correlate with scores on tests 1 & 2, although we know they sample non-overlapping sets of modules. (IQ, being a weighted average of test/module scores, will also correlate with all of the tests.) A lot of psychologists would interpret that as evidence against tests 1 & 2 measuring distinct mental abilities, even though we see there’s an alternative explanation.
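The three-test battery above can be simulated directly. In this sketch (the module labels follow the example; the sample size and use of the first principal component as "g" are my own illustrative choices), tests 1 and 2 share no modules at all, yet the extracted g correlates substantially with both:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Ten uncorrelated modules per person; columns 0..9 stand for A..J.
m = rng.standard_normal((n, 10))
t1 = m[:, 0:5].sum(axis=1)    # test 1 samples A, B, C, D, E
t2 = m[:, 5:10].sum(axis=1)   # test 2 samples F, G, H, I, J (disjoint from test 1)
t3 = m[:, 2:7].sum(axis=1)    # test 3 samples C, D, E, F, G (overlaps both)

scores = np.column_stack([t1, t2, t3])
corr = np.corrcoef(scores, rowvar=False)

# "g" = first principal component of the battery.
w = np.linalg.eigh(corr)[1][:, -1]
w = w if w.sum() > 0 else -w  # fix the eigenvector's arbitrary sign
z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
g = z @ w

for k, t in enumerate([t1, t2, t3], start=1):
    print(f"corr(g, test {k}) = {np.corrcoef(g, t)[0, 1]:.2f}")
```

All three correlations come out clearly positive, even though tests 1 and 2 measure entirely distinct sets of modules — the overlap with test 3 is enough to pull everything onto a single factor.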
Even if we did find an index of intelligence that didn’t correlate with IQ/g, would we count it as such? Duckworth & Seligman discovered that in a sample of 164 schoolchildren, a composite measure of self-discipline predicted GPA significantly better than IQ, and self-discipline didn’t correlate significantly with IQ. Does self-discipline now count as an independent intellectual ability? I’d lean towards saying it doesn’t, but I doubt I could justify being dogmatic about that; it’s surely a cognitive ability in the term’s broadest sense.
For example, I can think of a situation where psychologists & psychometricians have missed a similar phenomenon: negatively correlated cognitive tests. I know of a couple of examples which I found only because the mathematician Warren D. Smith describes them in his paper “Mathematical definition of ‘intelligence’ (and consequences)”.
That’s an extremely interesting reference, thanks for the link! This is exactly the kind of approach that this area desperately needs: no-nonsense scrutiny by someone with a strong math background and without an ideological agenda.
David Hilbert allegedly once quipped that physics is too important to be left to physicists; the way things are, it seems to me that psychometrics should definitely not be left to psychologists. That they haven’t immediately rushed to explore further these findings by Smith is an extremely damning fact about the intellectual standards in the field.
Duckworth & Seligman discovered that in a sample of 164 schoolchildren, a composite measure of self-discipline predicted GPA significantly better than IQ, and self-discipline didn’t correlate significantly with IQ. Does self-discipline now count as an independent intellectual ability?
Wouldn’t this closely correspond to the Big Five “conscientiousness” trait? (Which the paper apparently doesn’t mention at all?!) From what I’ve seen, even among the biggest fans of IQ, it is generally recognized that conscientiousness is at least similarly important as general intelligence in predicting success and performance.
Wouldn’t this closely correspond to the Big Five “conscientiousness” trait? (Which the paper apparently doesn’t mention at all?!) From what I’ve seen, even among the biggest fans of IQ, it is generally recognized that conscientiousness is at least similarly important as general intelligence in predicting success and performance.
That’s an excellent point that hadn’t occurred to me at all. Turns out that self-discipline is actually one of the 6 subscales used to measure conscientiousness on the NEO-PI-R, so it’s clearly related to conscientiousness. With that in mind, it is a bit weird that conscientiousness doesn’t get a shoutout in the paper...
Is anything known about a physical basis for conscientiousness?
It can be reliably predicted by, for example, SPECT scans. If I recall correctly you can expect to see over-active frontal lobes and basal ganglia. For this reason (and because those areas depend on dopamine a lot) dopaminergics (Ritalin, etc) make a big difference.
I haven’t looked at Smith yet, but the quote looks like parody to me. Since you seem to take it seriously, I’ll respond. Awfully specific tests defying the predictions looks like data mining to me. I predict that these negative correlations are not replicable. The first seems to be the claim that verbal ability is not correlated with spatial ability, but this is a well-tested claim. As Shalizi mentions, psychometricians do look for separate skills and these are commonly accepted components. I wouldn’t be terribly surprised if there were ones they completely missed, but these two are popular and positively correlated. The second example is a little more promising: maybe that scattered Xs test is independent of verbal ability, even though it looks like other skills that are not, but I doubt it.
With respect to self-discipline, I think you’re experiencing some kind of halo effect. Not every positive mental trait should be called intelligence. Self-discipline is just not what people mean by intelligence. I knew that conscientiousness predicted GPAs, but I’d never heard such a strong claim. But it is true that a lot of people dismiss conscientiousness (and GPA) in favor of intelligence, and they seem to be making an error (or being risk-seeking).
I haven’t looked at Smith yet, but the quote looks like parody to me. Since you seem to take it seriously, I’ll respond.
Once you read the relevant passage in context, I anticipate you will agree with me that Smith is serious. Take this paragraph from before the passage I quoted from:
Further, let us return to Gould’s criticism that due to “validation” of most other highly used IQ tests and subtests, Spearman’s g was forced to appear to exist from then on, regardless of whether it actually did. In view of this … probably the only place we can look in the literature to find data truly capable of refuting or confirming Spearman, is data from the early days, before too much “validation” occurred, but not so early on that Spearman’s atrocious experimental and statistical practices were repeated.
The prime candidate I have been able to find for such data is Thurstone’s [205] “primary mental abilities” dataset published in 1938.
Smith then presents the example from Thurstone’s 1938 data.
Awfully specific tests defying the predictions looks like data mining to me. I predict that these negative correlations are not replicable.
I’d be inclined to agree if the 3 most negative correlations in the dataset had come from very different pairs of tests, but the fact that they come from sets of subtests that one would expect to tap similar narrow abilities suggests they’re not just statistical noise.
The first seems to be the claim that verbal ability is not correlated with spatial ability, but this is a well-tested claim. As Shalizi mentions, psychometricians do look for separate skills and these are commonly accepted components. I wouldn’t be terribly surprised if there were ones they completely missed, but these two are popular and positively correlated.
Smith himself does not appear to make that claim; he presents his two examples merely as demonstrations that not all mental ability scores positively correlate. I think it’s reasonable to package the 3 verbal subtests he mentions as strongly loading on verbal ability, but it’s not clear to me that the 3 other subtests he pairs them with are strong measures of “spatial ability”; two of them look like they tap a more specific ability to handle mental mirror images, and the third’s a visual memory test.
Even if it transpires that the 3 subtests all tap substantially into spatial ability, they needn’t necessarily correlate positively with specific measures of verbal ability, even though verbal ability correlates with spatial ability.
With respect to self-discipline, I think you’re experiencing some kind of halo effect. Not every positive mental trait should be called intelligence. Self-discipline is just not what people mean by intelligence.
I’m tempted to agree but I’m not sure such a strong generalization is defensible. Take a list of psychologists’ definitions of intelligence. IMO self-discipline plausibly makes sense as a component of intelligence under definitions 1, 7, 8, 13, 14, 23, 25, 26, 27, 28, 32, 33 & 34, which adds up to 37% of the list of definitions. A good few psychologists appear to include self-discipline as a facet of intelligence.
Even if we did find an index of intelligence that didn’t correlate with IQ/g, would we count it as such? Duckworth & Seligman discovered that in a sample of 164 schoolchildren, a composite measure of self-discipline predicted GPA significantly better than IQ, and self-discipline didn’t correlate significantly with IQ. Does self-discipline now count as an independent intellectual ability? I’d lean towards saying it doesn’t, but I doubt I could justify being dogmatic about that; it’s surely a cognitive ability in the term’s broadest sense.
This finding is consistent with the folk notion of “crazy geniuses.”
Though it’s important to note that the second study was done on college students, who must have a certain level of IQ and who aren’t representative of the population.
The first study notes:
According to this proposal the significant negative correlation could be observed only in groups with above average mental abilities and not in a random sample from a general population.
If we took a larger sample of the population, including lower IQ individuals, then I think we would see the negative correlation between Conscientiousness and intelligence diminish or even reverse, because I bet there are lots of people outside a college population who have both low intelligence and low Conscientiousness.
It could be that a moderate amount of Conscientiousness (well, whatever mechanisms cause Conscientiousness) is necessary for above average intelligence, but too much Conscientiousness (i.e. those mechanisms are too strong) limits intelligence.
I noticed a while back when a bunch of LW’ers gave their Big Five scores that our Conscientiousness scores tended to be low. I took that to be an internet thing (people currently reading a website are more likely to be lazy slobs) but this is a more flattering explanation.
Interesting. I would’ve expected Conscientiousness to correlate weakly positively with IQ across most IQ levels.
I would avoid interpreting a negative correlation between C/self-discipline and IQ as evidence against C/self-discipline being a separate facet of intelligence; I think that would beg the question by implicitly assuming that IQ represents the entirety of what we call intelligence.
If only! I’m just a physics student but I’ve read a few books and quite a few articles about IQ.
[Edit: I’ve got an amateur interest in statistics as well, which helps a lot on this subject. Vladimir_M is right that there’s a lot of crap statistics peddled in this field.]
“All of this, of course, is completely compatible with IQ having some ability, when plugged into a linear regression, to predict things like college grades or salaries or the odds of being arrested by age 30. (This predictive ability is vastly less than many people would lead you to believe [cf.], but I’m happy to give them that point for the sake of argument.) This would still be true if I introduced a broader mens sana in corpore sano score, which combined IQ tests, physical fitness tests, and (to really return to the classical roots of Western civilization) rated hot-or-not sexiness. Indeed, since all these things predict success in life (of one form or another), and are all more or less positively correlated, I would guess that MSICS scores would do an even better job than IQ scores. I could even attribute them all to a single factor, a (for arete), and start treating it as a real causal variable. By that point, however, I’d be doing something so obviously dumb that I’d be accused of unfair parody and arguing against caricatures and straw-men.”
This is the point here. There’s a difference between coming up with linear combinations and positing real, physiological causes.
My beef isn’t with Shalizi’s reasoning, which is correct. I disagree with his text connotationally. Calling something a “myth” because it isn’t a causal factor and you happen to study causal factors is misleading. Most people who use g don’t need it to be a genuine causal factor; a predictive factor is enough for most uses, as long as we can’t actually modify dendrite density in living humans or something like that.
If g is a causal factor then “A has higher g than B” adds additional information to the statement “A scored higher than B on such-and-such tests.” It might mean, for instance, that you could look in A’s brain and see different structure than in B’s brain; it might mean that we would expect A to be better at unrelated, previously untested skills.
If g is not a causal factor, then comments about g don’t add any new information; they just sort of summarize or restate. That difference is significant.
A predictive factor is enough for predictive uses, but not for a lot of policy uses, which rely on causality. From your comment, I assume you are not a lefty, and that you think we should be more confident than we are about using IQ to make decisions regarding race. I think that Shalizi’s reasoning is likely relevant to making those decisions; it should probably make us more guarded in practice.
I don’t understand your last paragraph. Could you give an example? Is this relevant to the decision of whether intelligence tests should be used for choosing firemen? or is that a predictive use?
The kinds of implications I’m thinking about are that if IQ causes X (and IQ is heritable), then we should not seek to change X by social engineering means, because it won’t be possible. X could be the distribution of college admittees, firemen, criminals, etc.
Not all policy has to rely on causal factors, of course. And my thinking is a little blurry on these issues in general.
The way you define “real” properties, it seems you can’t tell them from “unreal” ones by looking at correlations alone; we need causal intervention for that, a la Pearl. So until we invent tech for modifying dendrite density of living humans, or something like that, there’s no practical difference between “real” g and “unreal” g and no point in making the distinction between them. In particular, their predictive power is the same.
So, basically, your and Shalizi’s demand for a causal factor is too strong. We can do with weaker tools.
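cousin_it’s point can be made concrete: a single-common-factor model and a Thomson-style sampling model can be tuned to produce the same correlation matrix, so no passively observed data distinguishes “real” g from “unreal” g. Here’s a sketch (numpy; the parameters — six tests, inter-test correlations of about 0.5, 1000 abilities — are my own made-up choices, not anything from Shalizi’s post):

```python
import numpy as np

rng = np.random.default_rng(3)
n_people, n_tests = 2000, 6

# Model A: a single real causal factor g behind every test score.
g = rng.standard_normal(n_people)
scores_a = (np.sqrt(0.5) * g[:, None]
            + np.sqrt(0.5) * rng.standard_normal((n_people, n_tests)))

# Model B: no g at all -- each test samples a random half of 1000
# independent abilities (Thomson's sampling model).
abilities = rng.standard_normal((n_people, 1000))
masks = np.stack([rng.permutation(1000) < 500 for _ in range(n_tests)])
scores_b = (abilities @ masks.T.astype(float)) / np.sqrt(500)

corr_a = np.corrcoef(scores_a, rowvar=False)
corr_b = np.corrcoef(scores_b, rowvar=False)
# Entries differ only by sampling noise; observationally the models agree.
print(np.round(corr_a - corr_b, 2))
```

Only an intervention (Pearl’s do-operator, e.g. actually boosting g and seeing whether every score moves) could tell the two apart, which is exactly the tech we don’t have.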
But neither of those are particularly compelling reasons for disagreement—can anyone more familiar with the psychological/statistical territory shed some light?
Shalizi’s most basic point — that factor analysis will generate a general factor for any bunch of sufficiently strongly correlated variables — is correct.
Here’s a demo. The statistical analysis package R comes with some built-in datasets to play with. I skimmed through the list and picked out six monthly datasets (72 data points in each):
It’s pretty unlikely that there’s a single causal general factor that explains most of the variation in all six of these time series, especially as they’re from mostly non-overlapping time intervals. They aren’t even that well correlated with each other: the mean correlation between different time series is −0.10 with a std. dev. of 0.34. And yet, when I ask R’s canned factor analysis routine to calculate a general factor for these six time series, that general factor explains 1⁄3 of their variance!
However, Shalizi’s blog post covers a lot more ground than just this basic point, and it’s difficult for me to work out exactly what he’s trying to say, which in turn makes it difficult to say how correct he is overall. What does Shalizi mean specifically by calling g a myth? Does he think it is very unlikely to exist, or just that factor analysis is not good evidence for it? Who does he think is in error about its nature? I can think of one researcher in particularwho stands out as just not getting it, but beyond that I’m just not sure.
In your example, we have no reason to privilege the hypothesis that there is an underlying causal factor behind that data. In the case of g, wouldn’t its relationships to neurobiology be a reason to give a higher prior probability to the hypothesis that g is actually measuring something real? These results would seem surprising if g was merely a statistical “myth.”
In the case of g, wouldn’t its relationships to neurobiology be a reason to give a higher prior probability to the hypothesis that g is actually measuring something real?
The best evidence that g measures something real is that IQ tests are highly reliable, i.e. if you get your IQ or g assessed twice, there’s a very good correlation between your first score and your second score. Something has to generate the covariance between retestings; that g & IQ also correlate with neurobiological variables is just icing on the cake.
To answer your question directly, g’s neurobiological associations are further evidence that g measures something real, and I believe g does measure something real, though I am not sure what.
These results would seem surprising if g was merely a statistical “myth.”
Shalizi is, somewhat confusingly, using the word “myth” to mean something like “g’s role as a genuine physiological causal agent is exaggerated because factor analysis sucks for causal inference”, rather than its normal meaning of “made up”. Working with Shalizi’s (not especially clear) meaning of the word “myth”, then, it’s not that surprising that g correlates with neurobiology, because it is measuring something — it’s just not been proven to represent a single causal agent.
Personally I would’ve preferred Shalizi to use some word other than “myth” (maybe “construct”) to avoid exactly this confusion: it sounds as if he’s denying that g measures anything, but I don’t believe that’s his intent, nor what he actually believes. (Though I think there’s a small but non-negligible chance I’m wrong about that.)
From what I can gather, he’s saying all other evidence points to a large number of highly specialized mental functions instead of one general intelligence factor, and that psychologists are making a basic error by not understanding how to apply and interpret the statistical tests they’re using. It’s the latter which I find particularly unlikely (not impossible though).
You might be right. I’m not really competent to judge the first issue (causal structure of the mind), and the second issue (interpretation of factor analytic g) is vague enough that I could see myself going either way on it.
Belatedly: Economic development (including population growth?) is related to CO2, lung deaths, international airline passengers, average air temperatures (through global warming), and car accidents.
I don’t think it’s surprising that an untenable claim could persist within a field for a long time, once established. Pluto was called a planet for seventy-six years.
I’ve no idea whether the critique of g is accurate, however.
That’s a bizarre choice of example. The question of whether Pluto is a planet is entirely a definitional one; the IAU could make it one by fiat if they chose. There’s no particular reason for it not to be one, except that the IAU felt the increasing number of tranNeptunian objects made the current definition awkward.
Once upon a time it was thought that the word “fish” included dolphins. Now you could play the oh-so-clever arguer, and say, “The list: {Salmon, guppies, sharks, dolphins, trout} is just a list—you can’t say that a list is wrong. I can prove in set theory that this list exists. So my definition of fish, which is simply this extensional list, cannot possibly be ‘wrong’ as you claim.”
Or you could stop playing nitwit games and admit that dolphins don’t belong on the fish list.
Honestly, it would make the most sense to draw four lists, like the Hayden Planetarium did, with rocky planets, asteroids, gas giants, and Kuiper Belt objects each in their own category, but it is obviously wrong to include everything from Box 1 and Box 3 and one thing from Box 4. The only reason it was done that way is that they didn’t know better and didn’t want to change until they had to.
You (well, EY) make a good point, but I think neither the Pluto remark nor the fish one is actually an example of this.
In the case of Pluto, the trans-Neptunians and the other planets seem to belong in a category that the asteroids don’t. They’re big and round! Moreover, they presumably underwent a formation process that the asteroid belt failed to complete in the same way (or whatever the current theory of formation of the asteroid belt is; I think that it involves failure to form a “planet” due to tidal forces from Jupiter?). Of course there are border cases like Ceres, but I think there is a natural category (whatever that means!) that includes the rocky planets, gas giants and Kuiper Belt objects that does not include (most) asteroids and comets.
On the fish example, I claim that the definition of “fish” that includes the modern definition of fish union the cetaceans is a perfectly valid natural category, and that this is therefore an intensional definition. “Fish” are all things that live in the water, have finlike or flipperlike appendages and are vaguely hydrodynamic. The fact that such things do not all share a common descent* is immaterial to the fact that they look the same and act the same at first glance. As human knowledge has increased, we have made a distinction between fish and things that look like fish but aren’t, but we reasonably could have kept the original definition of fish and called the scientific concept something else, say “piscoids”.
*well, actually they do, but you know approximately what I mean.
Nitpick: if in your definition of fish, you mean that they need to both have fins or flippers and be (at least) vaguely hydrodynamic, I don’t think seahorses and puffer fish qualify.
Yes, but neither fish nor (fish union cetaceans) is monophyletic. The descent tree rooted at the last common ancestor of fish also contains tetrapods, and the descent tree rooted at the last common ancestor of tetrapods contains the cetaceans.
I am not any sort of biologist, so I am unclear on the terminological technicalities, which is why I handwaved this in my post above.
I’m inclined to agree. Having a name for ‘things that naturally swim around in the water, etc’ is perfectly reasonable and practical. It is in no way a nitwit game.
I think this is one of the few cases where Shalizi is wrong. (Not an easy thing to say, as I’m a big fan of his.)
In the second part of the article he generates synthetic “test scores” of people who have three thousand independent abilities—“facets of intelligence” that apply to different problems—and demonstrates that standard factor analysis still detects a strong single g-factor explaining most of the variance between people. From that he concludes that g is a “statistical artefact” and lacks “reality”. This is exactly like saying the total weight of the rockpile “lacks reality” because the weights of individual rocks are independent variables.
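Shalizi’s demo is straightforward to replicate. A minimal sketch (numpy; each of the 3000 independent abilities is sampled by a random half of the tests — the exact sampling scheme and sample sizes here are my simplification, not necessarily Shalizi’s):

```python
import numpy as np

rng = np.random.default_rng(42)
n_people, n_abilities, n_tests = 500, 3000, 8
half = n_abilities // 2

# Each person has 3000 independent "abilities"; no general factor exists.
abilities = rng.standard_normal((n_people, n_abilities))

# Each test samples a random half of the abilities and sums them.
masks = np.stack([rng.permutation(n_abilities) < half for _ in range(n_tests)])
scores = abilities @ masks.T.astype(float)        # shape: (people, tests)

corr = np.corrcoef(scores, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]
g_share = eigvals[0] / n_tests                    # variance explained by "g"

off_diag = corr[~np.eye(n_tests, dtype=bool)]
print(f"mean inter-test correlation: {off_diag.mean():.2f}")
print(f"'g' explains {g_share:.0%} of the variance")
```

All the pairwise correlations come out strongly positive and the first factor soaks up most of the variance, despite the abilities being independent by construction — which is exactly the behaviour the rock-pile analogy is arguing about.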
As for the reason why he is wrong, it’s pretty clear: Shalizi is a Marxist (fo’ real) and can’t give an inch to those pesky racists. A sad sight, that.
cousin_it:
Indeed. A while ago, I got intensely interested in these controversies over intelligence research, and after reading a whole pile of books and research papers, I got the impression that there is some awfully bad statistics being pushed by pretty much every side in the controversy, so at the end I was left skeptical towards all the major opposing positions (though to varying degrees). If there existed a book written by someone as smart and knowledgeable as Shalizi that would present a systematic, thorough, and unbiased analysis of this whole mess, I would gladly pay $1,000 for it. Alas, Shalizi has definitely let his ideology get the better of him this time.
He also wrote an interesting long post on the heritability of IQ, which is better, but still clearly slanted ideologically. I recommend reading it nevertheless, but to get a more accurate view of the whole issue, I recommend reading the excellent Making Sense of Heritability by Neven Sesardić alongside it.
There is no such book (yet), but there are two books that cover the most controversial part of the mess that I’d recommend: Race Differences in Intelligence (1975) and Race, IQ and Jensen (1980). They are both systematic, thorough, and about as unbiased as one can reasonably expect on the subject of race & IQ. On the down side, they don’t really cover other aspects of the IQ controversies, and they’re three decades out of date. (That said, I personally think that few studies published since 1980 bear strongly on the race & IQ issue, so the books’ age doesn’t matter that much.)
Yes, among the books on the race-IQ controversy that I’ve seen, I agree that these are the closest thing to an unbiased source. However, I disagree that nothing very significant has happened in the field since their publication—although unfortunately, taken together, these new developments have led to an even greater overall confusion. I have in mind particularly the discovery of the Flynn effect and the Minnesota adoption study, which have made it even more difficult to argue coherently either for a hereditarian or an environmentalist theory the way it was done in the seventies.
Also, even these books fail to present a satisfactory treatment of some basic questions where a competent statistician should be able to clarify things fully, but horrible confusion has nevertheless persisted for decades. Here I refer primarily to the use of the regression to the mean as a basis for hereditarian arguments. From what I’ve seen, Jensen is still using such arguments as a major source of support for his positions, constantly replying to the existing superficial critiques with superficial counter-arguments, and I’ve never seen anyone giving this issue the full attention it deserves.
Me too! I just don’t think there’s been much new data brought to the table. I agree with you in counting Flynn’s 1987 paper and the Minnesota followup report, and I’d add Moore’s 1986 study of adopted black children, the recent meta-analyses by Jelte Wicherts and colleagues on the mean IQs of sub-Saharan Africans, Dickens & Flynn’s 2006 paper on black Americans’ IQs converging on whites’ (and at a push, Rushton & Jensen’s reply along with Dickens & Flynn’s), Fryer & Levitt’s 2007 paper about IQ gaps in young children, and Fagan & Holland’s papers (200200080-6), 2007, 2009) on developing tests where minorities score equally to whites. I guess Richard Lynn et al.’s papers on the mean IQ of East Asians count as well, although it’s really the black-white comparison that gets people’s hackles up.
Having written out a list, it does look longer than I expected...although it’s not much for 30-35 years of controversy!
Amen. The regression argument should’ve been dropped by 1980 at the latest. In fairness to Flynn, his book does namecheck that argument and explain why it’s wrong, albeit only briefly.
satt:
If I remember correctly, Loehlin’s book also mentions it briefly. However, it seems to me that the situation is actually more complex.
Jensen’s arguments, in the forms in which he has been stating them for decades, are clearly inadequate. Some very good responses were published 30+ years ago by Mackenzie and Furby. Yet for some bizarre reason, prominent critics of Jensen have typically ignored these excellent references and instead produced their own much less thorough and clear counterarguments.
Nevertheless, I’m not sure if the argument should end here. Certainly, if we observe a subpopulation S in which the values of a trait follow a normal distribution with the mean M(S) that is lower than for the whole population, then in pairs of individuals from S among whom there exists a correlation independent of rank and smaller than one, the lower-ranked individuals will regress towards M(S). That’s a mathematical tautology, and nothing can be inferred from it about what the causes of the individual and group differences might be; the above cited papers explain this fact very well.
However, the question that I’m not sure about is: what can we conclude from the fact that the existing statistical distributions and correlations are such that they satisfy these mathematical conditions? Is this really a trivial consequence of the norming of tests that’s engineered so as to give their scores a normal distribution over the whole population? I’d like to see someone really statistics-savvy scrutinize the issue without starting from the assumption that both the total population distribution and the subpopulation distribution are normal and that the correlation coefficients between relatives are independent of their rank in the distribution.
What would appropriate policy be if we just don’t know to what extent IQ is different in different groups?
Well, if you’ll excuse the ugly metaphor, in this area even the positive questions are giant cans of worms lined on top of third rails, so I really have no desire to get into public discussions of normative policy issues.
OK, I’ll bite. Can you point to specific parts of that post which are in error owing to ideologically motivated thinking?
Morendil:
A piece of writing biased for ideological reasons doesn’t even have to have any specific parts that can be shown to be in error per se. Enormous edifices of propaganda can be constructed—and have been constructed many times in history—based solely on the selection and arrangement of the presented facts and claims, which can all be technically true by themselves.
In areas that arouse strong ideological passions, all sorts of surveys and other works aimed at broad audiences can be expected to suffer from this sort of bias. For a non-expert reader, this problem can be recognized and overcome only by reading works written by people espousing different perspectives. That’s why I recommend that people should read Shalizi’s post on heritability, but also at least one more work addressing the same issues written by another very smart author who doesn’t share the same ideological position. (And Sesardić’s book is, to my knowledge, the best such reference about this topic.)
Instead of getting into a convoluted discussion of concrete points in Shalizi’s article, I’ll just conclude with the following remark. You can read Shalizi’s article, conclude that it’s the definitive word on the subject, and accept his view of the matter. But you can also read more widely on the topic, and see that his presentation is far from unbiased, even if you ultimately conclude that his basic points are correct. The relevant literature is easily accessible if you just have internet and library access.
Your analogy is flawed, I think.
The weight of the rock pile is just what we call the sum of the weights of the rocks. It’s just a definition; but the idea of general intelligence is more than a definition. If there were a real, biological thing called g, we would expect all kinds of abilities to be correlated. Intelligence would make you better at math and music and English. We would expect basically all cognitive abilities to be affected by g, because g is real—it represents something like dendrite density, some actual intelligence-granting property.
People hypothesized that g is real because results of all kinds of cognitive tests are correlated. But what Shalizi showed is that you can generate the same correlations if you let test scores depend on three thousand uncorrelated abilities. You can get the same results as the IQ advocates even when absolutely no single factor determines different abilities.
Sure, your old g will correlate with multiple abilities—hell, you could let g = “test score” and that would correlate with all the abilities—but that would be meaningless. If size and location determine the price of a house, you don’t declare that there is some factor that causes both large size and desirable location!
SarahC:
Just to be clear, this is not an original idea by Shalizi, but the well-known “sampling theory” of general intelligence first proposed by Godfrey Thomson almost a century ago. Shalizi states this very clearly in the post, and credits Thomson with the idea. However, for whatever reason, he fails to mention the very extensive discussions of this theory in the existing literature, and writes as if Thomson’s theory had been ignored ever since, which definitely doesn’t represent the actual situation accurately.
In a recent paper by van der Maas et al., which presents an extremely interesting novel theory of correlations that give rise to g (and which Shalizi links to at one point), the authors write:
Note that I take no position here about whether these criticisms of the sampling theory are correct or not. However, I think this quote clearly demonstrates that an attempt to write off g by merely invoking the sampling theory is not a constructive contribution to the discussion.
I would also add that if someone managed to construct multiple tests of mental ability that would sample disjoint sets of Thomsonesque underlying abilities and thus fail to give rise to g, it would be considered a tremendous breakthrough. Yet, despite the strong incentive to achieve this, nobody who has tried so far has succeeded. This evidence is far from conclusive, but far from insignificant either.
I think Shalizi isn’t too far off the mark in writing “as if Thomson’s theory had been ignored”. Although a few psychologists & psychometricians have acknowledged Thomson’s sampling model, in everyday practice it’s generally ignored. There are far more papers out there that fit g-oriented factor models as a matter of course than those that try to fit a Thomson-style model. Admittedly, there is a very good reason for that — Thomson-style models would be massively underspecified on the datasets available to psychologists, so it’s not practical to fit them — but that doesn’t change the fact that a g-based model is the go-to choice for the everyday psychologist.
There’s an interesting analogy here to Shalizi’s post about IQ’s heritability, now I think about it. Shalizi writes it as if psychologists and behaviour geneticists don’t care about gene-environment correlation, gene-environment interaction, nonlinearities, there not really being such a thing as “the” heritability of IQ, and so on. One could object that this isn’t true — there are plenty of papers out there concerned with these complexities — but on the other hand, although the textbooks pay lip service to them, researchers often resort to fitting models that ignore these speedbumps. The reason for this is the same as in the case of Thomson’s model: given the data available to scientists, models that accounted for these effects would usually be ruinously underspecified. So they make do.
However, it seems to me that the fatal problem of the sampling theory is that nobody has ever managed to figure out a way to sample disjoint sets of these hypothetical uncorrelated modules. If all practically useful mental abilities and all the tests successfully predicting them always sample some particular subset of these modules, then we might as well look at that subset as a unified entity that represents the causal factor behind g, since its elements operate together as a group in all relevant cases.
Or is there some additional issue here that I’m not taking into account?
I can’t immediately think of any additional issue. It’s more that I don’t see the lack of well-known disjoint sets of uncorrelated cognitive modules as a fatal problem for Thomson’s theory, merely weak disconfirming evidence. This is because I assign a relatively low probability to psychologists detecting tests that sample disjoint sets of modules even if they exist.
For example, I can think of a situation where psychologists & psychometricians have missed a similar phenomenon: negatively correlated cognitive tests. I know of a couple of examples which I found only because the mathematician Warren D. Smith describes them in his paper “Mathematical definition of ‘intelligence’ (and consequences)”. The paper’s about the general goal of coming up with universal definitions of and ways to measure intelligence, but in the middle of it is a polemical/sceptical summary of research into g & IQ.
Smith went through a correlation matrix for 57 tests given to 240 people, published by Thurstone in 1938, and saw that the 3 most negative of the 1596 intercorrelations were between these pairs of tests:
“100-word vocabulary test // Recognize pictures of hand as Right/Left” (correlation = −0.22)
“Find lots of synonyms of a given word // Decide whether 2 pictures of a national flag are relatively mirrored or not” (correlation = −0.16)
“Describe somebody in writing: score=# words used // figure recognition test: decide which numbers in a list of drawings of abstract figures are ones you saw in a previously shown list” (correlation = −0.12)
In Smith’s words: “This seems too much to be a coincidence!” Smith then went to the 60-item correlation matrix for 710 schoolchildren published by Thurstone & Thurstone in 1941 and did the same, discovering that
The existence of two pairs of negatively correlated cognitive skills leads me to increase my prior for the existence of uncorrelated cognitive skills.
Also, the way psychologists often analyze test batteries makes it harder to spot disjoint sets of uncorrelated modules. Suppose we have a 3-test battery, where test 1 samples uncorrelated modules A, B, C, D & E, test 2 samples F, G, H, I & J, and test 3 samples C, D, E, F & G. If we administer the battery to a few thousand people and extract a g from the results, as is standard practice, then by construction the resulting g is going to correlate with scores on tests 1 & 2, although we know they sample non-overlapping sets of modules. (IQ, being a weighted average of test/module scores, will also correlate with all of the tests.) A lot of psychologists would interpret that as evidence against tests 1 & 2 measuring distinct mental abilities, even though we see there’s an alternative explanation.
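The battery example can be simulated directly. A sketch (numpy; ten modules standing in for A–J, with the test compositions exactly as in the hypothetical above, and “g” extracted as the battery’s first principal component as a stand-in for a proper factor extraction):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000

modules = rng.standard_normal((n, 10))   # uncorrelated modules A..J

test1 = modules[:, 0:5].sum(axis=1)      # samples A, B, C, D, E
test2 = modules[:, 5:10].sum(axis=1)     # samples F, G, H, I, J
test3 = modules[:, 2:7].sum(axis=1)      # samples C, D, E, F, G

scores = np.column_stack([test1, test2, test3])
z = (scores - scores.mean(axis=0)) / scores.std(axis=0)

# Extract "g" as the first principal component of the battery.
corr = np.corrcoef(z, rowvar=False)
w, v = np.linalg.eigh(corr)
g_vec = v[:, -1]
if g_vec.sum() < 0:                      # fix the eigenvector's arbitrary sign
    g_vec = -g_vec
g = z @ g_vec

r1 = np.corrcoef(g, test1)[0, 1]
r2 = np.corrcoef(g, test2)[0, 1]
r12 = np.corrcoef(test1, test2)[0, 1]
print(f"corr(g, test1)={r1:.2f}, corr(g, test2)={r2:.2f}, "
      f"corr(test1, test2)={r12:.2f}")
```

Tests 1 and 2 come out essentially uncorrelated with each other, yet both correlate substantially with the extracted “g”, purely because test 3 bridges their module sets.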
Even if we did find an index of intelligence that didn’t correlate with IQ/g, would we count it as such? Duckworth & Seligman discovered that in a sample of 164 schoolchildren, a composite measure of self-discipline predicted GPA significantly better than IQ, and self-discipline didn’t correlate significantly with IQ. Does self-discipline now count as an independent intellectual ability? I’d lean towards saying it doesn’t, but I doubt I could justify being dogmatic about that; it’s surely a cognitive ability in the term’s broadest sense.
satt:
That’s an extremely interesting reference, thanks for the link! This is exactly the kind of approach that this area desperately needs: no-nonsense scrutiny by someone with a strong math background and without an ideological agenda.
David Hilbert allegedly once quipped that physics is too important to be left to physicists; the way things are, it seems to me that psychometrics should definitely not be left to psychologists. That they haven’t immediately rushed to explore these findings of Smith’s further is an extremely damning fact about the intellectual standards in the field.
Wouldn’t this closely correspond to the Big Five “conscientiousness” trait? (Which the paper apparently doesn’t mention at all?!) From what I’ve seen, even among the biggest fans of IQ, it is generally recognized that conscientiousness is at least similarly important as general intelligence in predicting success and performance.
That’s an excellent point that completely did not occur to me. Turns out that self-discipline is actually one of the 6 subscales used to measure conscientiousness on the NEO-PI-R, so it’s clearly related to conscientiousness. With that in mind, it is a bit weird that conscientiousness doesn’t get a shoutout in the paper...
Is anything known about a physical basis for conscientiousness?
It can be reliably predicted by, for example, SPECT scans. If I recall correctly you can expect to see over-active frontal lobes and basal ganglia. For this reason (and because those areas depend on dopamine a lot) dopaminergics (Ritalin, etc) make a big difference.
I haven’t looked at Smith yet, but the quote looks like parody to me. Since you seem to take it seriously, I’ll respond. Awfully specific tests defying the predictions looks like data mining to me. I predict that these negative correlations are not replicable. The first seems to be the claim that verbal ability is not correlated with spatial ability, but this is a well-tested claim. As Shalizi mentions, psychometricians do look for separate skills and these are commonly accepted components. I wouldn’t be terribly surprised if there were ones they completely missed, but these two are popular and positively correlated. The second example is a little more promising: maybe that scattered Xs test is independent of verbal ability, even though it looks like other skills that are not, but I doubt it.
With respect to self-discipline, I think you’re experiencing some kind of halo effect. Not every positive mental trait should be called intelligence. Self-discipline is just not what people mean by intelligence. I knew that conscientiousness predicted GPAs, but I’d never heard such a strong claim. But it is true that a lot of people dismiss conscientiousness (and GPA) in favor of intelligence, and they seem to be making an error (or being risk-seeking).
Once you read the relevant passage in context, I anticipate you will agree with me that Smith is serious. Take this paragraph from before the passage I quoted from:
Smith then presents the example from Thurstone’s 1938 data.
I’d be inclined to agree if the 3 most negative correlations in the dataset had come from very different pairs of tests, but the fact that they come from sets of subtests that one would expect to tap similar narrow abilities suggests they’re not just statistical noise.
Smith himself does not appear to make that claim; he presents his two examples merely as demonstrations that not all mental ability scores positively correlate. I think it’s reasonable to package the 3 verbal subtests he mentions as strongly loading on verbal ability, but it’s not clear to me that the 3 other subtests he pairs them with are strong measures of “spatial ability”; two of them look like they tap a more specific ability to handle mental mirror images, and the third’s a visual memory test.
Even if it transpires that the 3 subtests all tap substantially into spatial ability, they needn’t necessarily correlate positively with specific measures of verbal ability, even though verbal ability correlates with spatial ability.
I’m tempted to agree but I’m not sure such a strong generalization is defensible. Take a list of psychologists’ definitions of intelligence. IMO self-discipline plausibly makes sense as a component of intelligence under definitions 1, 7, 8, 13, 14, 23, 25, 26, 27, 28, 32, 33 & 34, which adds up to 37% of the list of definitions. A good few psychologists appear to include self-discipline as a facet of intelligence.
Interesting thought. It turns out that Conscientiousness is actually negatively related to intelligence, while Openness is positively correlated with intelligence.
This finding is consistent with the folk notion of “crazy geniuses.”
Though it’s important to note that the second study was done on college students, who must have a certain level of IQ and who aren’t representative of the population.
The first study notes:
If we took a larger sample of the population, including lower IQ individuals, then I think we would see the negative correlation between Conscientiousness and intelligence diminish or even reverse, because I bet there are lots of people outside a college population who have both low intelligence and low Conscientiousness.
It could be that a moderate amount of Conscientiousness (well, whatever mechanisms cause Conscientiousness) is necessary for above average intelligence, but too much Conscientiousness (i.e. those mechanisms are too strong) limits intelligence.
I noticed a while back when a bunch of LW’ers gave their Big Five scores that our Conscientiousness scores tended to be low. I took that to be an internet thing (people currently reading a website are more likely to be lazy slobs) but this is a more flattering explanation.
No it doesn’t. The whole point of that article is that it’s a mistake to ask people how conscientious they are.
Interesting. I would’ve expected Conscientiousness to correlate weakly positively with IQ across most IQ levels.
I would avoid interpreting a negative correlation between C/self-discipline and IQ as evidence against C/self-discipline being a separate facet of intelligence; I think that would beg the question by implicitly assuming that IQ represents the entirety of what we call intelligence.
Just out of curiosity: is psychology your domain of expertise? You speak confidently and with details.
If only! I’m just a physics student but I’ve read a few books and quite a few articles about IQ.
[Edit: I’ve got an amateur interest in statistics as well, which helps a lot on this subject. Vladimir_M is right that there’s a lot of crap statistics peddled in this field.]
Ok, that’s interesting new stuff—I haven’t read this literature at all.
“All of this, of course, is completely compatible with IQ having some ability, when plugged into a linear regression, to predict things like college grades or salaries or the odds of being arrested by age 30. (This predictive ability is vastly less than many people would lead you to believe [cf.], but I’m happy to give them that point for the sake of argument.) This would still be true if I introduced a broader mens sana in corpore sano score, which combined IQ tests, physical fitness tests, and (to really return to the classical roots of Western civilization) rated hot-or-not sexiness. Indeed, since all these things predict success in life (of one form or another), and are all more or less positively correlated, I would guess that MSICS scores would do an even better job than IQ scores. I could even attribute them all to a single factor, a (for arete), and start treating it as a real causal variable. By that point, however, I’d be doing something so obviously dumb that I’d be accused of unfair parody and arguing against caricatures and straw-men.”
This is the point here. There’s a difference between coming up with linear combinations and positing real, physiological causes.
My beef isn’t with Shalizi’s reasoning, which is correct. I disagree with his text connotationally. Calling something a “myth” because it isn’t a causal factor and you happen to study causal factors is misleading. Most people who use g don’t need it to be a genuine causal factor; a predictive factor is enough for most uses, as long as we can’t actually modify dendrite density in living humans or something like that.
Ok, let’s talk connotations.
If g is a causal factor then “A has higher g than B” adds additional information to the statement “A scored higher than B on such-and-such tests.” It might mean, for instance, that you could look in A’s brain and see different structure than in B’s brain; it might mean that we would expect A to be better at unrelated, previously untested skills.
If g is not a causal factor, then comments about g don’t add any new information; they just sort of summarize or restate. That difference is significant.
A predictive factor is enough for predictive uses, but not for a lot of policy uses, which rely on causality. From your comment, I assume you are not a lefty, and that you think we should be more confident than we are about using IQ to make decisions regarding race. I think that Shalizi’s reasoning is likely not irrelevant to making those decisions; it should probably make us more guarded in practice.
I don’t understand your last paragraph. Could you give an example? Is this relevant to the decision of whether intelligence tests should be used for choosing firemen? or is that a predictive use?
The kinds of implications I’m thinking about are that if IQ causes X, (and if IQ is heritable) then we should not seek to change X by social engineering means, because it won’t be possible. X could be the distribution of college admittees, firemen, criminals, etc.
Not all policy has to rely on causal factors, of course. And my thinking is a little blurry on these issues in general.
Seconding Douglas_Knight’s question. I don’t understand why you say policy uses must rely on causal factors.
The way you define “real” properties, it seems you can’t tell them from “unreal” ones by looking at correlations alone; we need causal intervention for that, a la Pearl. So until we invent tech for modifying dendrite density of living humans, or something like that, there’s no practical difference between “real” g and “unreal” g and no point in making the distinction between them. In particular, their predictive power is the same.
So, basically, your and Shalizi’s demand for a causal factor is too strong. We can do with weaker tools.
Shalizi’s most basic point — that factor analysis will generate a general factor for any bunch of sufficiently strongly correlated variables — is correct.
Here’s a demo. The statistical analysis package R comes with some built-in datasets to play with. I skimmed through the list and picked out six monthly datasets (72 data points in each):
atmospheric CO2 concentrations, 1959-1964
female UK lung deaths, 1974-1979
international airline passengers, 1949-1954
sunspot counts, 1749-1754
average air temperatures at Nottingham Castle, 1920-1925
car drivers killed & seriously injured in Great Britain, 1969-1974
It’s pretty unlikely that there’s a single causal general factor that explains most of the variation in all six of these time series, especially as they’re from mostly non-overlapping time intervals. They aren’t even that well correlated with each other: the mean correlation between different time series is −0.10 with a std. dev. of 0.34. And yet, when I ask R’s canned factor analysis routine to calculate a general factor for these six time series, that general factor explains 1⁄3 of their variance!
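The same point can be demonstrated without any real data at all. Here's a minimal Python sketch (my own construction, not from R's built-in datasets) of Thomson's sampling model: six "tests" each sum a random half of many independent "abilities", so by construction there is no single general factor, yet the tests all correlate positively and the first factor of their correlation matrix still soaks up most of the variance. All names and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_abilities, n_tests = 2000, 500, 6

# Thomson's sampling model: many small independent "abilities",
# with NO general factor anywhere in the data-generating process.
abilities = rng.standard_normal((n_people, n_abilities))

# Each test score is the sum of a random half of the abilities,
# so any two tests overlap on roughly a quarter of them.
tests = np.column_stack([
    abilities[:, rng.choice(n_abilities, n_abilities // 2, replace=False)].sum(axis=1)
    for _ in range(n_tests)
])

# Eigendecomposition of the correlation matrix: the share of the
# trace taken by the top eigenvalue is the variance a one-factor
# (principal component) solution would attribute to "g".
corr = np.corrcoef(tests, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]  # descending order
frac = eigvals[0] / eigvals.sum()
print(f"share of variance on first factor: {frac:.2f}")
```

With random half-overlaps the expected pairwise correlation is about 0.5, so the first factor accounts for well over half the variance, despite the model containing hundreds of separate abilities and no g.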
However, Shalizi’s blog post covers a lot more ground than just this basic point, and it’s difficult for me to work out exactly what he’s trying to say, which in turn makes it difficult to say how correct he is overall. What does Shalizi mean specifically by calling g a myth? Does he think it is very unlikely to exist, or just that factor analysis is not good evidence for it? Who does he think is in error about its nature? I can think of one researcher in particular who stands out as just not getting it, but beyond that I’m just not sure.
In your example, we have no reason to privilege the hypothesis that there is an underlying causal factor behind that data. In the case of g, wouldn’t its relationships to neurobiology be a reason to give a higher prior probability to the hypothesis that g is actually measuring something real? These results would seem surprising if g was merely a statistical “myth.”
The best evidence that g measures something real is that IQ tests are highly reliable, i.e. if you get your IQ or g assessed twice, there’s a very good correlation between your first score and your second score. Something has to generate the covariance between retestings; that g & IQ also correlate with neurobiological variables is just icing on the cake.
To answer your question directly, g’s neurobiological associations are further evidence that g measures something real, and I believe g does measure something real, though I am not sure what.
Shalizi is, somewhat confusingly, using the word “myth” to mean something like “g’s role as a genuine physiological causal agent is exaggerated because factor analysis sucks for causal inference”, rather than its normal meaning of “made up”. Working with Shalizi’s (not especially clear) meaning of the word “myth”, then, it’s not that surprising that g correlates with neurobiology, because it is measuring something — it’s just not been proven to represent a single causal agent.
Personally I would’ve preferred Shalizi to use some word other than “myth” (maybe “construct”) to avoid exactly this confusion: it sounds as if he’s denying that g measures anything, but I don’t believe that’s his intent, nor what he actually believes. (Though I think there’s a small but non-negligible chance I’m wrong about that.)
From what I can gather, he’s saying all other evidence points to a large number of highly specialized mental functions instead of one general intelligence factor, and that psychologists are making a basic error by not understanding how to apply and interpret the statistical tests they’re using. It’s the latter which I find particularly unlikely (not impossible though).
You might be right. I’m not really competent to judge the first issue (causal structure of the mind), and the second issue (interpretation of factor analytic g) is vague enough that I could see myself going either way on it.
By the way, welcome to Less Wrong! Feel free to introduce yourself on that thread!
If you haven’t been reading through the Sequences already, there was a conversation last month about good, accessible introductory posts that has a bunch of links and links-to-links.
Thank you!
Belatedly: Economic development (including population growth?) is related to CO2, lung deaths, international airline passengers, average air temperatures (through global warming), and car accidents.
Here is a useful post directly criticizing Shalizi’s claims: http://humanvarieties.org/2013/04/03/is-psychometric-g-a-myth/
I don’t think it’s surprising that an untenable claim could persist within a field for a long time, once established. Pluto was called a planet for seventy-six years.
I’ve no idea whether the critique of g is accurate, however.
That’s a bizarre choice of example. The question of whether Pluto is a planet is entirely a definitional one; the IAU could make it one by fiat if they chose. There’s no particular reason for it not to be one, except that the IAU felt the increasing number of transNeptunian objects made the current definition awkward.
“[E]ntirely a definitional” question does not mean “arbitrary and trivial”—some definitions are just wrong. EY mentions the classic example in Where to Draw the Boundary?:
Honestly, it would make the most sense to draw four lists, like the Hayden Planetarium did, with rocky planets, asteroids, gas giants, and Kuiper Belt objects each in their own category, but it is obviously wrong to include everything from Box 1 and Box 3 and one thing from Box 4. The only reason it was done is because they didn’t know better and didn’t want to change until they had to.
You (well, EY) make a good point, but I think neither the Pluto remark nor the fish one is actually an example of this.
In the case of Pluto, the transNeptunians and the other planets seem to belong in a category that the asteroids don’t. They’re big and round! Moreover, they presumably underwent a formation process that the asteroid belt failed to complete in the same way (or whatever the current theory of formation of the asteroid belt is; I think that it involves failure to form a “planet” due to tidal forces from Jupiter?). Of course there are border cases like Ceres, but I think there is a natural category (whatever that means!) that includes the rocky planets, gas giants and Kuiper Belt objects that does not include (most) asteroids and comets.
On the fish example, I claim that the definition of “fish” that includes the modern definition of fish union the cetaceans is a perfectly valid natural category, and that this is therefore an intensional definition. “Fish” are all things that live in the water, have finlike or flipperlike appendages and are vaguely hydrodynamic. The fact that such things do not all share a common descent* is immaterial to the fact that they look the same and act the same at first glance. As human knowledge has increased, we have made a distinction between fish and things that look like fish but aren’t, but we reasonably could have kept the original definition of fish and called the scientific concept something else, say “piscoids”.
*well, actually they do, but you know approximately what I mean.
Nitpick: if in your definition of fish, you mean that they need to both have fins or flippers and be (at least) vaguely hydrodynamic, I don’t think seahorses and puffer fish qualify.
The usual term is “monophyletic”.
Yes, but neither fish nor (fish union cetaceans) is monophyletic. The descent tree rooted at the last common ancestor of fish also contains tetrapods, and the descent tree rooted at the last common ancestor of tetrapods contains the cetaceans.
I am not any sort of biologist, so I am unclear on the terminological technicalities, which is why I handwaved this in my post above.
Fish are a paraphyletic group.
I’m inclined to agree. Having a name for ‘things that naturally swim around in the water, etc’ is perfectly reasonable and practical. It is in no way a nitwit game.