Why do you say that? My reaction to that paper was very negative. In large part this was because of the anecdotal flavor of the arguments made there, but also because I didn’t see the two things I was specifically looking for:
Citations of studies in which a linear model was constructed using one set of data, and then compared as to performance against the experts using a different set of data.
Failing that, some numbers that would convince me that the failure to test models on data other than that used to construct them just doesn’t matter.
Instead, here and in the 1996 study by Grove & Meehl, I find arguments from incredulity—in effect: “Do our critics really think this matters? Don’t be absurd!” I also notice that this ideology is being promoted by a small number of researchers who repeatedly cite each other’s work, and do not cite critics (except as strawmen).
Like Perplexed, I hated this paper. Of course, it has the very good excuse that it is from 1979. But in 2011, it is sort of expected that you evaluate your model on a second, independent dataset. (My models often crash and burn at this stage.) Did any of these studies do this?
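To make concrete what the requested check involves: a minimal sketch of fitting a linear model on one sample and scoring it on an independent holdout. The data here is synthetic and the `fit_ols`/`mse` helpers are illustrative, not anything from the papers under discussion; the point is only that the test-set error, not the training-set error, is the number that should be compared against the experts.

```python
import random

random.seed(0)

def fit_ols(xs, ys):
    # Closed-form simple linear regression: y = a + b*x
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def mse(a, b, xs, ys):
    # Mean squared prediction error of the fitted line
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data: y = 2x + noise
data = [(x, 2 * x + random.gauss(0, 1))
        for x in (random.uniform(0, 10) for _ in range(200))]
train, test = data[:100], data[100:]

# Fit on the first sample only...
a, b = fit_ols(*zip(*train))

# ...then evaluate on both; only the test figure is an honest estimate
print("train MSE:", mse(a, b, *zip(*train)))
print("test  MSE:", mse(a, b, *zip(*test)))
```

If the model has merely memorized quirks of the construction sample, the gap between the two numbers shows it; a model that "crashes and burns" at this stage does so by posting a test MSE far above its training MSE.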
Read the Dawes pdf linked in the top post. I can’t speak for the other examples, but that one is solid.
edit: my apologies, re-reading I see you discussed the marriage example. What is your opinion on the graduate rating and Hodgkin’s disease examples?