I previously mentioned that item non-response might be a good measure of Conscientiousness. Before doing anything fancy with non-response, I first checked that there was a correlation with the questionnaire reports. The correlation is zero:
R> lwc <- subset(lw, !is.na(as.integer(as.character(BigFiveC))))
R> missing_answers <- apply(lwc, 1, function(x) sum(sapply(x, function(y) is.na(y) || as.character(y)==" ")))
R> cor.test(as.integer(as.character(lwc$BigFiveC)), missing_answers)
Pearson’s product-moment correlation
data: as.integer(as.character(lwc$BigFiveC)) and missing_answers
t = −0.0061, df = 421, p-value = 0.9952
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
−0.09564 0.09505
sample estimates:
cor
−0.0002954
# visualize to see if we made some mistake somewhere
R> plot(as.integer(as.character(lwc$BigFiveC)), missing_answers)
I am completely surprised. The results in the economics paper looked great and the rationale is very plausible. Yet… The 2 sets of data here have the right ranges; there's plenty of variation in both dimensions; I'm sure I'm catching most of the item non-responses or NAs, given that there are non-response counts as high as 34; there are a lot of datapoints; and the correlation isn't in the opposite direction (which might indicate a coding error) — it simply isn't there at all. Yvain questions the Big Five results, but otherwise they look exactly as I would've predicted before seeing them: low C, E, and A; high O; medium N.
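How informative is this null? A quick back-of-the-envelope sketch (using the n = 423 implied by the df = 421 printed above): for r near zero, the 95% confidence interval half-width under the Fisher z approximation is about 1.96/sqrt(n−3):

```r
# Approximate 95% CI half-width for a correlation near zero
# (Fisher z approximation); n = 423 since cor.test reported df = 421.
n <- 423
1.96 / sqrt(n - 3)
# ~0.0956, matching the printed interval (-0.09564, 0.09505);
# so a true correlation much above |r| = 0.1 would very likely have been detected
```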
There may be something very odd about LWers and Conscientiousness; when I try C vs. Income, there's an almost-zero correlation again:
R> cor.test(as.integer(as.character(lwc$BigFiveC)), log1p(as.integer(lwc$Income)))
Pearson’s product-moment correlation
data: as.integer(as.character(lwc$BigFiveC)) and log1p(as.integer(lwc$Income))
t = 0.2178, df = 421, p-value = 0.8277
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
−0.08482 0.10585
sample estimates:
cor
0.01061
I guess the next step is a linear model on income vs age, Conscientiousness, and IQ:
lwc <- subset(lw, !is.na(as.integer(as.character(BigFiveC))))
lwc <- subset(lwc, !is.na(as.integer(as.character(Age))))
lwc <- subset(lwc, !is.na(as.integer(as.character(IQ))))
lwc <- subset(lwc, !is.na(as.integer(as.character(Income))))
c <- as.integer(as.character(lwc$BigFiveC))
age <- as.integer(as.character(lwc$Age))
iq <- as.integer(as.character(lwc$IQ))
income <- log1p(as.integer(as.character(lwc$Income)))
summary(lm(income ~ (age + iq + c)))
Call:
lm(formula = income ~ (age + iq + c))
Residuals:
Min 1Q Median 3Q Max
−8.762 −0.849 1.191 2.319 3.644
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) −0.5531 3.5479 −0.16 0.88
age 0.1311 0.0323 4.06 9.5e-05
iq 0.0339 0.0267 1.27 0.21
c 0.0174 0.0121 1.44 0.15
Residual standard error: 3.35 on 106 degrees of freedom
(489 observations deleted due to missingness)
Multiple R-squared: 0.196, Adjusted R-squared: 0.173
F-statistic: 8.59 on 3 and 106 DF, p-value: 3.73e-05
So all of them combined don't explain much, and most of the work is being done by the age variable… There are supposedly many high-income LWers (in this subset of respondents reporting age, income, IQ, and Conscientiousness, the max is 700,000), so I'd expect a cumulative r^2 of more than 0.173 for all 3 variables; if those aren't governing income, what is? Maybe everyone working with computers is rich and the others poor? Let's look at everyone who submitted salary and profession and see whether the practical computer people are making bank:
lwi <- subset(lw, !is.na(as.integer(as.character(Income))))
lwi <- subset(lwi, !is.na(as.character(Profession)))
cs <- as.integer(as.character(lwi[as.character(lwi$Profession)=="Computers (practical: IT, programming, etc.)",]$Income))
others <- as.integer(as.character(lwi[as.character(lwi$Profession)!="Computers (practical: IT, programming, etc.)",]$Income))
# ordinary t-test, but we’ll exclude anyone with zero income (unemployed?)
t.test(cs[cs!=0],others[others!=0])
Welch Two Sample t-test
data: cs[cs != 0] and others[others != 0]
t = 5.905, df = 309.3, p-value = 9.255e-09
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
22344 44673
sample estimates:
mean of x mean of y
76458 42950
Wow. Just wow. 76k vs 43k. I mean, maybe this would go away with enough fiddling (e.g. adjusting for cost of living), but it's still dramatic. This suggests a new theory to me: maybe Conscientiousness does correlate with income at its usual high rate for everyone except the computer people, who are simply in such high demand that a lack of Conscientiousness doesn't matter:
R> lwi <- subset(lw, !is.na(as.integer(as.character(Income))))
R> lwi <- subset(lwi, !is.na(as.character(Profession)))
R> lwi <- subset(lwi, !is.na(as.integer(as.character(BigFiveC))))
R> cs <- lwi[as.character(lwi$Profession)=="Computers (practical: IT, programming, etc.)",]
R> others <- lwi[as.character(lwi$Profession)!="Computers (practical: IT, programming, etc.)",]
R> cor.test(as.integer(as.character(cs$BigFiveC)), as.integer(as.character(cs$Income)))
Pearson’s product-moment correlation
data: as.integer(as.character(cs$BigFiveC)) and as.integer(as.character(cs$Income))
t = 0.5361, df = 87, p-value = 0.5933
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
−0.1527 0.2625
sample estimates:
cor
0.05738
R> cor.test(as.integer(as.character(others$BigFiveC)), as.integer(as.character(others$Income)))
Pearson’s product-moment correlation
data: as.integer(as.character(others$BigFiveC)) and as.integer(as.character(others$Income))
t = 1.997, df = 200, p-value = 0.04721
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.001785 0.272592
sample estimates:
cor
0.1398
So for the CS people the correlation is small and not statistically significant, while for the non-CS people the correlation is almost 3x larger and statistically significant.
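One caveat worth appending (a sketch of mine, not part of the original analysis): the two correlations can be compared directly with a Fisher r-to-z test, using the r values and sample sizes (n = df + 2) read off the two cor.test outputs above:

```r
# Fisher r-to-z test for the difference between two independent correlations
r_cs <- 0.05738;  n_cs <- 89      # CS respondents: df = 87, so n = 89
r_other <- 0.1398; n_other <- 202  # non-CS respondents: df = 200, so n = 202
z <- (atanh(r_cs) - atanh(r_other)) / sqrt(1/(n_cs - 3) + 1/(n_other - 3))
2 * pnorm(-abs(z))
# p ~ 0.5: at these sample sizes, the *difference* between the two
# correlations is not itself statistically significant
```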
Were you expecting that people with high C would or wouldn’t skip questions? I can see arguments either way. Conscientious people might skip questions they don’t have answers to or that they aren’t willing to put the time into to give a good answer, or they might put in the work to have answers they consider good to as many questions as possible.
Is it feasible to compare wrong sort of answer with C?
Is it possible that the test for C wasn’t very good?
> Were you expecting that people with high C would or wouldn’t skip questions?
Wouldn’t; that was the claim of the linked paper.
> Is it feasible to compare wrong sort of answer with C?
Not really, if it wasn’t caught by the no-answer check or the NA check.
> Is it possible that the test for C wasn’t very good?
As I said, it came out as expected for LW as a whole, and it did correlate with income once the CS salaries were removed… Hard to know what ground-truth there could be to check the scores against.
I am also surprised by this. I wonder about the effect of “I’m taking this survey so I don’t have to go to bed / do work / etc.,” but I wouldn’t have expected that to be as large as the diligence effect.
Also, perhaps look at nonresponse by section? I seem to recall the C part being after the personality test, which might be having some selection effects.
> Also, perhaps look at nonresponse by section? I seem to recall the C part being after the personality test, which might be having some selection effects.
What do you mean? I can’t compare non-response with C for anyone who didn’t supply a C score, and there were plenty of questions left to skip after the personality test section.
It seems to me that other survey non-response may be uncorrelated with C once you condition on taking a long personality survey, especially if the personality survey doesn’t allow nonresponse. (I seem to recall taking all of the optional surveys and considering the personality one the most boring. I don’t know how much that generalizes to other people.) The first way that comes to mind to gather information for this is to compare the nonresponse of people who supplied personality scores and people who didn’t, but that isn’t a full test unless you can come up with another way to link the nonresponse to C.
I was thinking it might help to break down the responses by section, and seeing if nonresponse to particular sections was correlated with C, but the result could only be that some sections are anticorrelated if a few are correlated. So that probably won’t get you anything.
> It seems to me that other survey non-response may be uncorrelated with C once you condition on taking a long personality survey, especially if the personality survey doesn’t allow nonresponse. (I seem to recall taking all of the optional surveys and considering the personality one the most boring. I don’t know how much that generalizes to other people.)
Why would the strong correlation go away after adding a floor? That would simply restrict the range… if that were true, we’d expect to see a cutoff for all C scores but in fact we see plenty of very low C scores being reported.
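The attenuation point can be illustrated with a toy simulation (made-up parameters, not the survey data): imposing a floor on C shrinks a built-in correlation with income, but does not drive it to zero:

```r
set.seed(1)
c_score <- rnorm(10000)                    # simulated C scores
income  <- 0.3 * c_score + rnorm(10000)    # build in a true correlation of ~0.29
cor(c_score, income)                       # full sample: close to 0.29
floored <- c_score > 0                     # impose a floor on C
cor(c_score[floored], income[floored])     # attenuated but still clearly positive
```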
> The first way that comes to mind to gather information for this is to compare the nonresponse of people who supplied personality scores and people who didn’t, but that isn’t a full test unless you can come up with another way to link the nonresponse to C.
Yes. You’d expect, by definition, that people who answered the personality questions would have fewer non-responses than the people who didn’t… That’s pretty obvious and true:
R> lwc <- subset(lw, !is.na(as.integer(as.character(BigFiveC))))
R> missing_answers1 <- apply(lwc, 1, function(x) sum(sapply(x, function(y) is.na(y) || as.character(y)==" ")))
R> lwnc <- subset(lw, is.na(as.integer(as.character(BigFiveC))))
R> missing_answers2 <- apply(lwnc, 1, function(x) sum(sapply(x, function(y) is.na(y) || as.character(y)==" ")))
R> t.test(missing_answers1, missing_answers2)
Welch Two Sample t-test
data: missing_answers1 and missing_answers2
t = −25.19, df = 806.8, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
−18.77 −16.05
sample estimates:
mean of x mean of y
9.719 27.129
There is a correlation of 0.13 between non-responses and N.
Of course, there’s also a correlation of −0.13 between C and the random number generator.
People who had seen the RNG give a large number were primed to feel unusually reckless when taking the Big 5 test. Duh. (Just kidding.)