Trying to rederive the constants doesn’t help me, which is starting to make me wonder if he’s really using the table he provided or misstated an equation or something:
R> erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
R> sf <- function(iq,f,c) ((c/2) * (1 + erf((iq-f)/(15*sqrt(2)))))
R> summary(nls(rGDPpc ~ sf(IQ,f,c), lynn, start=list(f=110,c=40000)))
Formula: rGDPpc ~ sf(IQ, f, c)
Parameters:
    Estimate Std. Error t value Pr(>|t|)
f      99.64       3.07   32.44  < 2e-16
c   34779.17    6263.90    5.55  3.7e-07
Residual standard error: 5310 on 79 degrees of freedom
Number of iterations to convergence: 4
Achieved convergence tolerance: 8.22e-06
If you double 34779 you get very close to his $69,321, so there might be something going wrong due to the 1⁄2 that appears when the erf is used to make a cumulative distribution function, but I don’t see how a threshold of 99.64 IQ is even close to his 108!
(The weird start values were found via trial-and-error in trying to avoid R’s ‘singular gradient error’; it doesn’t appear to make a difference if you start with, say, f=90.)
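That factor-of-2 suspicion is easy to check numerically. A quick sketch follows (in Python rather than R, simply because math.erf is in its stdlib and this is a standalone arithmetic check; the 108 cutoff and the fitted 34779 are taken from the discussion above). The point: with the 1⁄2 included, the sigmoid plateaus at c as IQ grows; without it, the same shape plateaus at 2c, so misplacing the 1⁄2 changes the fitted scale constant by exactly a factor of two.

```python
from math import erf, sqrt

# Standard normal CDF via erf: Phi(z) = (1 + erf(z / sqrt(2))) / 2
def phi(z):
    return (1 + erf(z / sqrt(2))) / 2

# Fraction of a population (mean iq, SD 15) above a cutoff iq0
def smart_fraction(iq, iq0):
    return phi((iq - iq0) / 15)

# With the 1/2 included, the sigmoid plateaus at c as iq grows;
# without it, the same shape plateaus at 2c.  So whether the 1/2 is
# folded in changes the fitted scale constant by exactly a factor of 2,
# which would explain 34779 vs ~69321.
c = 34779
z = (200 - 108) / (15 * sqrt(2))   # far above the cutoff, so erf(z) ~= 1
with_half = (c / 2) * (1 + erf(z))
without_half = c * (1 + erf(z))
print(round(with_half), round(without_half))  # 34779 69558
```

69558 is in the neighborhood of the $69,321, which is at least consistent with a dropped or doubled 1⁄2 somewhere.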
Most importantly, we appear to have figured out the answer to my original question: no, it is not easy. :P
So, I started off by deleting the eight outliers to make lynn2. I got an adjusted R^2 of 0.8127 for the exponential fit, and 0.7777 for the fit with iq0=108.2.
My nls came back with an optimal iq0 of 110, which is closer to the 108 I was expecting; the adjusted R^2 only increases to 0.7783, which is a minimal improvement, and still slightly worse than the exponential fit.
The value of the smart fraction cutoff appears to have a huge impact on the mapping from smart fraction to gdp, but doesn’t appear to have a significant effect on the goodness of fit, which troubles me somewhat. I’m also surprised that deleting the outliers seems to have improved the performance of the exponential fit more than the smart fraction fit, which is not what I would have expected from the graphs. (Though, I haven’t calculated this with the outliers included in R, and I also excluded the Asian data, and there’s more fiddling I can do, but I’m happy with this for now.)
> sf <- function(iq,iq0) ((1+erf((iq-iq0)/(15*sqrt(2))))/2)
> egdp <- function(iq,iq0,m,b) (m*sf(iq,iq0)+b)
> summary(nls(rGDPpc ~ egdp(IQ,iq0,m,b), lynn2, start=list(iq0=110,m=40000,b=0)))
Formula: rGDPpc ~ egdp(IQ, iq0, m, b)
Parameters:
       Estimate Std. Error t value Pr(>|t|)
iq0     110.019      4.305  25.556  < 2e-16 ***
m     77694.174  26708.502   2.909  0.00486 **
b       679.688   1039.144   0.654  0.51520
Residual standard error: 4054 on 70 degrees of freedom
> gwe <- lm(lynn2$rGDPpc ~ sf(lynn2$IQ,99.64)); summary(gwe)
Call:
lm(formula = lynn2$rGDPpc ~ sf(lynn2$IQ, 99.64))
Residuals:
      Min       1Q   Median       3Q      Max
-10621.6  -2463.1    442.6   1743.4  12439.7
Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)           -1345.7      886.6  -1.518    0.133
sf(lynn2$IQ, 99.64)   40552.7     2724.1  14.887   <2e-16 ***
Residual standard error: 4241 on 71 degrees of freedom
Multiple R-squared: 0.7574, Adjusted R-squared: 0.7539
F-statistic: 221.6 on 1 and 71 DF, p-value: < 2.2e-16
> opt <- lm(lynn2$rGDPpc ~ sf(lynn2$IQ,110)); summary(opt)
Call:
lm(formula = lynn2$rGDPpc ~ sf(lynn2$IQ, 110))
Residuals:
      Min       1Q   Median       3Q      Max
-11030.3  -1540.1   -416.8   1308.6  12493.5
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)           676.4      731.5   0.925    0.358
sf(lynn2$IQ, 110)   77577.0     4869.6  15.931   <2e-16 ***
Residual standard error: 4025 on 71 degrees of freedom
Multiple R-squared: 0.7814, Adjusted R-squared: 0.7783
F-statistic: 253.8 on 1 and 71 DF, p-value: < 2.2e-16
> his <- lm(lynn2$rGDPpc ~ sf(lynn2$IQ,108.2)); summary(his)
Call:
lm(formula = lynn2$rGDPpc ~ sf(lynn2$IQ, 108.2))
Residuals:
      Min       1Q   Median       3Q      Max
-11014.0  -1710.0   -196.9   1396.3  12432.9
Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)             362.6      748.0   0.485    0.629
sf(lynn2$IQ, 108.2)   67711.5     4258.5  15.900   <2e-16 ***
Residual standard error: 4031 on 71 degrees of freedom
Multiple R-squared: 0.7807, Adjusted R-squared: 0.7777
F-statistic: 252.8 on 1 and 71 DF, p-value: < 2.2e-16
> em <- lm(log(lynn2$rGDPpc) ~ lynn2$IQ); summary(em)
Call:
lm(formula = log(lynn2$rGDPpc) ~ lynn2$IQ)
Residuals:
     Min       1Q   Median       3Q      Max
-1.12157 -0.34268 -0.00503  0.29596  1.41540
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.812650   0.446308   1.821   0.0728 .
lynn2$IQ    0.089439   0.005051  17.706   <2e-16 ***
Residual standard error: 0.4961 on 71 degrees of freedom
Multiple R-squared: 0.8153, Adjusted R-squared: 0.8127
F-statistic: 313.5 on 1 and 71 DF, p-value: < 2.2e-16
Most importantly, we appear to have figured out the answer to my original question: no, it is not easy. :P
And inadvertently provided an object lesson for anyone watching about the value of researchers providing code...
The value of the smart fraction cutoff appears to have a huge impact on the mapping from smart fraction to gdp, but doesn’t appear to have a significant effect on the goodness of fit, which troubles me somewhat. I’m also surprised that deleting the outliers seems to have improved the performance of the exponential fit more than the smart fraction fit, which is not what I would have expected from the graphs.
My intuition so far is that La Griffe found a convoluted way of regressing on a sigmoid, and the gain is coming from the part which looks like an exponential. I’m a little troubled that his stuff is so hard to reproduce sanely and that he doesn’t compare against the exponential fit: the exponential is obvious and has a reasonable empirical justification. Granting that Dickerson published in 2006 and he wrote the smart fraction essay in 2002, he could at least have updated it.
[edit] Sorry, it looks like the formatting for my code is totally ugly.
You need to delete any trailing whitespace in your indented R terminal output. (Little-known feature of LW/Reddit Markdown code blocks: one or more trailing spaces causes the newline to be ignored and the next line glommed on. I filed an R bug to fix some cases of it, but I guess it doesn’t cover nls or you don’t have an updated version.)
I don’t understand your definition
egdp <- function(iq,iq0,m,b) (m*sf(iq,iq0)+b)
sf(iq,iq0) makes sense, of course, and m presumably is the multiplicative scale constant LG found to be 69k, but what is this b here and why is it being added? I don’t see how this tunes how big a smart fraction is necessary since shouldn’t it then be on the inside of sf somehow?
But using that formula and running your code (using the full dataset I posted originally, with outliers):
R> erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
R> sf <- function(iq,iq0) ((1+erf((iq-iq0)/(15*sqrt(2))))/2)
R> egdp <- function(iq,iq0,m,b) (m*sf(iq,iq0)+b)
R> summary(nls(rGDPpc ~ egdp(IQ,iq0,m,b), lynn, start=list(iq0=110,m=40000,b=0)))
Formula: rGDPpc ~ egdp(IQ, iq0, m, b)
Parameters:
     Estimate Std. Error t value Pr(>|t|)
iq0    102.08       4.89   20.88  < 2e-16
m    37108.87    9107.73    4.07  0.00011
b     1140.94    1445.76    0.79  0.43241
Residual standard error: 5320 on 78 degrees of freedom
Number of iterations to convergence: 7
Achieved convergence tolerance: 5.09e-06
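As a sanity check on the erf definition used above (sketched in Python’s stdlib rather than R, with statistics.NormalDist playing the role of pnorm): the identity 2·pnorm(x·√2) − 1 really does give erf, and the composed smart fraction collapses to a single normal CDF, which is the cleanest way to see why a stray 1⁄2 matters.

```python
from math import erf, sqrt
from statistics import NormalDist

# The R definition erf <- function(x) 2*pnorm(x*sqrt(2)) - 1, translated;
# NormalDist().cdf is the standard normal CDF (R's pnorm).
def erf_via_cdf(x):
    return 2 * NormalDist().cdf(x * sqrt(2)) - 1

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(erf_via_cdf(x) - erf(x)) < 1e-9

# The composed smart fraction is just a normal CDF:
# (1 + erf((iq-iq0)/(15*sqrt(2)))) / 2  ==  pnorm((iq-iq0)/15)
iq, iq0 = 100, 108.2
lhs = (1 + erf((iq - iq0) / (15 * sqrt(2)))) / 2
rhs = NormalDist().cdf((iq - iq0) / 15)
assert abs(lhs - rhs) < 1e-9
print(round(rhs, 4))   # fraction of a mean-100 population above IQ 108.2
```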
My intuition so far is that La Griffe found a convoluted way of regressing on a sigmoid, and the gain is coming from the part which looks like an exponential. I’m a little troubled that his stuff is so hard to reproduce sanely and that he doesn’t compare against the exponential fit: the exponential is obvious and has a reasonable empirical justification. Granting that Dickerson published in 2006 and he wrote the smart fraction essay in 2002, he could at least have updated it.
I emailed La Griffe via Steve Sailer in February 2013 with a link to this thread and a question about how his smart-fraction model works with the fresher IQ/nations data and compares to Dickerson’s work. Sailer forwarded my email, but neither of us has had a reply since; he speculated that La Griffe may be having health issues.
In the absence of any defense by La Griffe, I think Dickerson’s exponential works better than La Griffe’s fraction/sigmoid.
he doesn’t compare against the exponential fit: the exponential is obvious and has a reasonable empirical justification.
The theoretical justifications are entirely different, though. It seems reasonable to me to suppose there’s some minimal intelligence needed to be wealth-producing in an industrial society, and the smart fraction both estimates that threshold well and predicts gdp well. But, it also seems reasonable to treat log(gdp) as a more meaningful object than gdp.
It’s also bothersome that the primary empirical prediction of the smart fraction model (that there is some stable gdp level that you hit when everyone is higher than the smart fraction) is entirely from the extrapolated part of the dataset, and this doesn’t seem noticeably better than the exponential model, whose extrapolations are radically different.
Granting that Dickerson published in 2006 and he wrote the smart fraction essay in 2002 he could at least have updated.
Yeah; I’m curious what they’d have to say about the relative merits of the two models. I’ll see if I can get this question to them.
You need to delete any trailing whitespace in your indented R terminal output.
Fixed, thanks!
but what is this b here and why is it being added?
It’s an offset, so that it’s an affine fit rather than a linear fit: the gdp level for a population with no people above 108 IQ doesn’t have to be 0. Turns out, it’s not significantly different from zero, but I’d rather discover that than enforce it (and enforcing it can degrade the value for m).
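A toy sketch of that last point (invented numbers, nothing to do with the Lynn data): on data generated with a nonzero intercept, a regression forced through the origin absorbs the omitted b into an inflated slope.

```python
# Toy data with a known affine relationship: b = 1000, m = 40000.
xs = [0.1, 0.2, 0.3, 0.4, 0.5]
ys = [1000 + 40000 * x for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares with an intercept recovers m and b.
m_affine = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))
b_affine = mean_y - m_affine * mean_x

# Regression through the origin (b forced to 0): slope = sum(xy) / sum(x^2).
m_origin = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print(m_affine, b_affine)  # ~40000, ~1000
print(m_origin)            # ~42727: the omitted b leaks into the slope
```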
But, it also seems reasonable to treat log(gdp) as a more meaningful object than gdp.
I’m not entirely sure… For individuals, log-transforms make sense on their own merits as giving a better estimate of the utility of that money, but does that logic really apply to a whole country? More money means more can be spent on charity, shooting down asteroids, etc.
It’s also bothersome that the primary empirical prediction of the smart fraction model (that there is some stable gdp level that you hit when everyone is higher than the smart fraction) is entirely from the extrapolated part of the dataset, and this doesn’t seem noticeably better than the exponential model, whose extrapolations are radically different.
The next logical step would be to bring in the second 2006 edition of the Lynn dataset, which increased the set from 81 to 113, and use the latest available per-capita GDP (probably 2011). If the exponential fit gets better compared to the smart-fraction sigmoid, then that’s definitely evidence towards the conclusion that the smart-fraction is just a bad fit.
Yeah; I’m curious what they’d have to say about the relative merits of the two models. I’ll see if I can get this question to them.
I’d guess that he’d consider SF a fairly arbitrary model and not be surprised if an exponential fits better.
It’s an offset, so that it’s an affine fit rather than a linear fit: the gdp level for a population with no people above 108 IQ doesn’t have to be 0. Turns out, it’s not significantly different from zero, but I’d rather discover that than enforce it (and enforcing it can degrade the value for m).
Why can’t the GDP be 0 or negative? Afghanistan and North Korea are right now exhibiting what such a country looks like: they can barely feed themselves and export so much violence or fundamentalism or other dysfunctionality that rich nations are sinking substantial sums of money into supporting them and fixing problems.
For individuals, log-transforms make sense on their own merits as giving a better estimate of the utility of that money, but does that logic really apply to a whole country?
The argument would be that additional intelligence multiplies the per-capita wealth-producing apparatus that exists, rather than adding to it (or, in the smart fraction model, not doing anything once you clear a threshold).
Why can’t the GDP be 0 or negative?
There’s no restriction that b be positive, and so those are both options. I wouldn’t expect it to be negative because pre-industrial societies managed to survive, but that presumes that aid spending by the developed world is not subtracted from the GDP measurement of those countries. Once you take aid into account, then it does seem reasonable that places could become money pits.
The argument would be that additional intelligence multiplies the per-capita wealth-producing apparatus that exists, rather than adding to it (or, in the smart fraction model, not doing anything once you clear a threshold).
That’s the intuitive justification for an exponential model (each additional increment of intelligence adds a percentage of the previous GDP), but I don’t see how this justifies looking at log transforms.
There’s no restriction that b be positive, and so those are both options. I wouldn’t expect it to be negative because pre-industrial societies managed to survive
The difference would be a combination of negative externalities and changing Malthusian equilibria: it has never been easier for an impoverished country like North Korea or Afghanistan to export violence and cause massive costs they don’t bear (9/11 directly cost the US something like a decade of Afghanistan GDP once you remove all the aid given to Afghanistan), and public health programs like vaccinations enable much larger populations than ‘should’ be there.
That’s the intuitive justification for an exponential model (each additional increment of intelligence adds a percentage of the previous GDP), but I don’t see how this justifies looking at log transforms.
GDP ~ exp(IQ) is isomorphic to ln(GDP) ~ IQ, and I think log(dollars per year) is an easier unit to think about than something to the power of IQ.
[edit] The graph might look different, though. It might be instructive to compare the two, but I think the relationships should be mostly the same.
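The isomorphism is just a change of coordinates; here is a sketch using the coefficients from the em fit above (intercept 0.812650, slope 0.089439 per IQ point).

```python
from math import exp

# ln(GDP) = a + k*IQ  is the same model as  GDP = exp(a) * exp(k)**IQ;
# a and k are taken from the log-linear lm() fit above.
a, k = 0.812650, 0.089439

def gdp_log_linear(iq):
    return exp(a + k * iq)

def gdp_exponential(iq):
    return exp(a) * exp(k) ** iq

# Identical predictions (up to floating point) at any IQ:
for iq in (70, 90, 110):
    assert abs(gdp_log_linear(iq) - gdp_exponential(iq)) < 1e-6 * gdp_log_linear(iq)

# In the exponential reading, each extra IQ point multiplies predicted
# GDP by exp(k): roughly a 9% increase per point under this fit.
print(round(exp(k), 4))
```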
It’s worth pointing out that IQ numbers are inherently non-parametric: we simply have a ranking of performance on IQ tests, which are then scaled to fit a normal distribution.
If GDP ~ exp(IQ), that means that the correlation is better if we scale the rankings to fit a log-normal distribution instead (this is not entirely true because exp(mean(IQ)) is not the same as mean(exp(IQ)), but the geometric mean and arithmetic mean should be highly correlated with each other as well). I suspect that this simply means that GDP approximately follows a log-normal distribution.
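A small numeric aside on that caveat (standalone arithmetic, not the Lynn data): exp(mean(log x)) is the geometric mean, which sits below the arithmetic mean by Jensen’s inequality even though the two rank datasets similarly.

```python
from math import exp, log
from statistics import mean

xs = [1.0, 10.0, 100.0]

arith = mean(xs)                      # arithmetic mean of 1, 10, 100 is 37
geom = exp(mean(log(x) for x in xs))  # geometric mean of 1, 10, 100 is 10

print(arith, geom)

# Jensen's inequality in the direction mentioned above:
# exp(mean(z)) <= mean(exp(z)), with equality only for constant z.
zs = [-1.0, 0.0, 1.0]
assert exp(mean(zs)) <= mean(exp(z) for z in zs)
```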
I suspect that this simply means that GDP approximately follows a log-normal distribution.
This doesn’t quite follow, since both per capita GDP and mean national IQ aren’t drawn from the same sort of distribution as individual production and individual IQ are, but I agree with the broader comment that it is natural to think of the economic component of intelligence measured in dollars per year as lognormally distributed.