Wait, I’m not sure we’re talking about the same thing. I’m saying direct replication isn’t the most useful way to spend time. You’re talking about systematic experiment design flaws.
According to your writing, the failures in this example stem from a methodological flaw (not using an active control group). A direct replication of the n-back/IQ transfer would simply have hit p<.05 again, since it would share that same flaw; until the flaw is repaired, every subsequent finding will suffer from it too.
I’m saying only that direct replication isn’t useful. Rigorously checking the methods, and redoing the experiment correctly wherever the documented methodology fails, is always a good idea.
But the Jaeggi cluster also sometimes uses active control groups, with various differences in intervention, metrics, and interpretation. In fact, Jaeggi was co-author on a new dual n-back meta-analysis released this month*; it finds the same passive-active difference I did, and you know what their interpretation is? That it’s due to the correlated classification of US vs. international laboratories conducting particular experiments. (It never even occurred to me to classify the studies this way.) They note that psychology experiments sometimes reach different conclusions in other cultures/countries—which they do—so perhaps the lower results in American studies using active control groups are because Americans gain less from n-back training. The kindest thing I can say about this claim is that I may be able to falsify it with my larger collection of studies (they threw out or missed a lot).
So, after performing these conceptual extensions of their results—as you suggest—they continue to
...slowly wend [their] way through a tenuous nomological network, performing a long series of related experiments which appear to the uncritical reader as a fine example of “an integrated research program”, without ever once refuting or corroborating so much as a single strand of the network.
So it goes.
* http://www.gwern.net/docs/dnb/2014-au.pdf (mirrors: https://pdf.yt/d/VMPWmd0jpDYvZIjm , https://dl.dropboxusercontent.com/u/85192141/2014-au.pdf ); initial comments on it: https://groups.google.com/forum/#!topic/brain-training/GYqqSyfqffA