Link: “When Science Goes Psychic”
A major psychology journal is planning to publish a study that claims to present strong evidence for precognition. Naturally, this immediately stirred up a firestorm. There are a lot of scientific-process and philosophy-of-science issues involved, including replicability, peer review, Bayesian statistics, and degrees of scrutiny. The Flying Spaghetti Monster makes a guest appearance.
Original New York Times article on the study here.
And the Times asked a number of academics (including Douglas Hofstadter) to comment on the controversy. The discussion is here.
I, for one, defy the data.
One lesson of the common misuse of statistics is to not “defy the data” until you’re sure what it says.
Here’s an important reply cited in the other threads:
http://www.ruudwetzels.com//articles/Wagenmakersetal_subm.pdf
I read the NYT link yesterday, and IIRC they mention somewhere that statisticians had already found major flaws, much like that reply. I'm a little surprised anyone feels a need to "defy the data".
Hofstadter’s response is quite to the point.
I also liked “No Sacred Mantle”—I believe we should promote more widely some basic techniques for critical reading of science-related claims. I recently pushed out an essay on “Fact and folklore in software engineering” that went ridiculously viral thanks to HN, even though I’ll be the first to admit it’s not my clearest writing.
I think there’s a lot of pent-up demand for things like “how to read a popular article reporting on a science fact”, “how to read a scientific paper in a field you don’t know”, etc. No surprise there—after all, in terms of built-in preferences rationality is primarily about defending yourself from “facts” that you shouldn’t accept.
I’d like to see that. Or, rather than a how-to synthesis, how about some relatively raw data? A series of postings linking to scientific articles which got some initial positive play in the popular press, but which were later convincingly critiqued or debunked in the blogosphere.
Good science is all alike. Each example of bad science may be bad in its own individual way. (HT to LT).
Institut Agile? So advocacy for “agile practices” is your day job? Now I understand why our earlier conversation about TDD went so weirdly.
How’s that?
The implication seems to be that my job makes me biased about the topic. If so, that’s precisely the wrong conclusion to draw.
The job isn’t just advocacy, it’s also (at the moment mostly) research and, where necessary, debunking of Agile. (For instance, learning more about probability theory has made me more skeptical of “planning poker”.)
Prior to creating that job from scratch (including getting private funding to support my doing that job full-time), I’d supported myself by selling consulting and training as an expert on Scrum, Extreme Programming and Agile.
Institut Agile was the result of a conscious decision on my part to move to a professional position where I’d be able to afford a more rational assessment of the topic. For instance, I’m compiling an extensive bibliography of the existing empirical studies published that have attempted to verify the claimed benefits of TDD, and reviews and meta-analyses of these studies.
I’m quite interested in thoughtful critiques of TDD, provided that such criticism is expressed from a position of actually knowing something about the topic, or being willing to find out what the claims concerning TDD actually are.
To use a well-known form, if TDD works I desire to believe that TDD works, and if TDD doesn’t work I desire to believe that it doesn’t work.
From my point of view, our earlier conversation about TDD went weirdly because your responses stopped making sense for me starting from this one. For a while I attempted to correct for misunderstanding on my part and glean more information from you that could potentially change my mind, until that started looking like a lost cause.
Is it available online?
Do you always answer a question with another question?
I’m planning to make it available on the Institut Agile group on Mendeley. It’s intended to cover the entire set of agile practices; for instance, what’s relevant to TDD consists of the tags “bdd”, “tdd”, “unittest” and “refactoring”.
What is there right now is a subset only, though—I’m feeding the online set from a local BibDesk file which is still growing. (I’m also having some issues with the synchronization between the local file and Mendeley—there are some duplicates in the online set right now.) So that only has two articles tagged “tdd” proper.
The local file has 68 papers on 10 practices. Of these, 6 are tagged “tdd”, 4 “unittest”, 13 “refactoring”, and 2 “bdd”. There are additional citations that I haven’t yet copied over, found in the two most recent surveys of the topic that I’ve come across: one is a chapter on TDD in the O’Reilly book “Making Software: What Really Works, and Why We Believe It”, the other is an article by an overlapping set of authors in the winter (I think) issue of IEEE Software.
Another source is the set of proceedings of the Agile conference in the US, and the XP conference in Europe, which have run for about 10 years each now. Most of the articles from these are behind paywalls (at IEEE and Springer respectively), but I’m hoping to leverage my position as a member of the Agile Alliance board to set them free.
Anyway, that work is my answer to Hamming’s questions. It may not be the best answer, but I’m happy enough that I do have an answer.
Yep, paywalls… :-(
I can’t believe Hofstadter (or anyone, really) is arguing that Bem’s paper should not have been published. The paper was, presumably, published because peer-reviewers couldn’t find substantial flaws in its methodology. This speaks more about the nature of standard statistical practices in psychology, or at least about the peer-review practices at the journal in which Bem’s paper was published, which is useful information in either case.
N.B. It’s Daryl Bem again.
Yup, it’s news because it is actually being published now. The previous discussion was about the preprint paper, which was only discussed in obscure academic and academic-like circles...
I have a bizarre nitpick.
Hofstadter writes:
Why add the detail of incompatibility with the laws of physics? If this additional bizarre factor were real, the predictions made from the laws of physics alone (in the 13-ish situations), and so the model the laws of physics confer, would err; but amid all this confusion one can’t make conclusive statements about the laws of physics themselves. The strangeness could well be implemented within physics as we know it.
The charitable interpretation of Hofstadter’s comment is that the likelihood of the results all being luck is so low that we should look harder for flaws in the arguments of papers purporting to prove them than we would for less controversial papers. He seems to be suggesting that a more rigorous review would have meant the paper would not be published, or at least not published ‘prematurely’. Sounds sensible.
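To make the "too unlucky to be luck" intuition concrete, here is a minimal sketch (the counts 9 experiments and 8 significant results are illustrative assumptions, not figures from the thread): under the null hypothesis, the chance that most of a batch of independent experiments clear p < 0.05 is a simple binomial tail.

```python
from math import comb

def prob_at_least_k_significant(n, k, alpha=0.05):
    """P(at least k of n independent null experiments hit p < alpha),
    i.e. the binomial upper tail with success probability alpha."""
    return sum(comb(n, i) * alpha**i * (1 - alpha)**(n - i)
               for i in range(k, n + 1))

# Illustrative numbers: 8 or more "significant" results out of 9
# experiments, purely by chance, is astronomically unlikely.
p = prob_at_least_k_significant(9, 8)
print(p)  # on the order of 1e-10
```

Of course, this assumes independence and no selective reporting; the Wagenmakers reply linked above argues precisely that those assumptions are where such papers break down.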
A.K.A. “extraordinary claims require extraordinary evidence”. :)
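That slogan is just Bayes' rule in odds form: posterior odds = prior odds × Bayes factor. A quick sketch with made-up numbers (the prior of 1e-6 and the 100:1 Bayes factor are illustrative assumptions, not values from any of the papers discussed):

```python
def posterior(prior, bayes_factor):
    """Update a prior probability by a Bayes factor (likelihood ratio)
    using the odds form of Bayes' rule."""
    odds = prior / (1 - prior) * bayes_factor
    return odds / (1 + odds)

# With a tiny prior for psi, even evidence favoring it 100:1
# leaves the posterior probability very small.
print(posterior(1e-6, 100))  # still around 1e-4
```

So "extraordinary evidence" just means a Bayes factor large enough to overcome extraordinarily long prior odds.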