Does the quirk in the numbers make a difference to the core claims that the authors of the study are probing?
I don’t currently have access to the paper, so I can’t answer this question very effectively myself, but from work experience around science it seems that most real-world knowledge work involves a measure of accident (oops, we dropped the sample) and serendipity (I noticed X, which isn’t part of our grant but is fascinating and might lead somewhere). The good and bad parts of this very human process aren’t always visible in the published papers, but even when the bad parts show up in the data, they don’t always make much difference to the question you’re really interested in.
If you constrain your substantive claims to being consistent with both what the data shows and the quirks you know about from doing the experiments, then these flaws should do no significant harm to the broader scientific community’s understanding of the world. This is actually a place where a kind of encapsulation seems like a genuinely good thing to me, because it allows useful and relatively trustworthy information to gain broader currency while the idiosyncratic elements of the process that produced it are suppressed.
In the meantime, if the data tells the reader something that the authors don’t talk about, it represents an opportunity to initiate a conversation with the corresponding author. They might be able to tell you an amusing story about a “dropped sample” that doesn’t affect the core claims but explains whatever was puzzling you. Alternatively, the issue might be something really significant—and maybe it will lead to a collaboration or something :-)
Admittedly, the paper you’re pointing to may really be an obvious case of flawed data analysis that somehow slipped through peer review, but it’s hard to say without being able to see it.
I’m afraid it’s critical. They claim to have shown that their treatment allows rats to remember objects for up to 24 weeks after seeing them, while normal rats can remember objects for only 45 minutes. They ran a long series of tests on the treated rats to demonstrate this, but they appear to have stopped testing the normal rats after those rats “failed” the 60-minute test.
Similarly, they tested the treated rats for memory of up to 6 objects, but stopped testing the normal rats after they “failed” the test for 4 objects (figure 1D), although, again, the normal rats spent no more time examining the old objects in that test; they merely spent less time examining the new ones.
If the treatment had any effect, it appears to me that it affected the rats’ curiosity, not their memory. But the most likely explanation is that the normal rats failed those two critical tests purely by chance, perhaps because something startled them (e.g., a hawk passing overhead during the exposure to the new objects).
Back up—are you suggesting that a random factor could have that big an effect on the results? How small are their sample sizes?
The sample size is 16, which should be enough. A random factor shouldn’t have that big an effect if the trials were uncorrelated. But to make the trials uncorrelated, they would need to interleave them: if they run all the 30-minute tests, then all the 45-minute tests, then all the 60-minute tests, each block of tests is almost completely correlated in its environmental conditions. And because rats’ senses are so different from humans’, it’s impossible for a human observer to tell whether something unusual (to a rat) is going on.
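To put a rough number on that intuition, here’s a minimal Monte Carlo sketch in Python. Every parameter is invented purely for illustration (a hypothetical “novelty-preference” score with true mean 0.6, a pass threshold of 0.5, and a block-level shock with standard deviation 0.15); none of it comes from the paper. The point is just to show how a disturbance shared by an entire block of trials can sink the group average even with 16 rats, while the same amount of disturbance spread across interleaved trials cannot.

```python
import numpy as np

rng = np.random.default_rng(0)

N_RATS = 16        # the paper's sample size
N_SIMS = 100_000   # Monte Carlo repetitions

# Hypothetical setup, invented for illustration: each rat produces a
# "novelty-preference" score with true mean 0.6, and the group "passes"
# a test if its mean score stays above 0.5.
TRUE_MEAN = 0.6
THRESHOLD = 0.5
RAT_NOISE_SD = 0.1  # independent, per-rat noise

def group_fails(block_shock_sd):
    """Simulate one test block of N_RATS rats and report whether the
    group mean falls below THRESHOLD. block_shock_sd controls the size
    of a single disturbance (a hawk overhead, an anxious experimenter)
    shared by every rat in the block."""
    shared_shock = rng.normal(0.0, block_shock_sd)  # one draw per block
    scores = TRUE_MEAN + shared_shock + rng.normal(0.0, RAT_NOISE_SD, N_RATS)
    return scores.mean() < THRESHOLD

# Blocked design: every rat in the block shares the same conditions.
blocked = np.mean([group_fails(0.15) for _ in range(N_SIMS)])

# Interleaved design: block-level disturbances average out across the
# schedule, so the shared component is (idealized here as) zero.
interleaved = np.mean([group_fails(0.0) for _ in range(N_SIMS)])

print(f"P(group 'fails' by chance), blocked trials:     {blocked:.3f}")
print(f"P(group 'fails' by chance), interleaved trials: {interleaved:.4f}")
```

With these made-up numbers, the blocked group drops below the threshold in roughly a quarter of the simulated runs, while the interleaved group essentially never does. A sample of 16 protects you against independent noise, not against noise that every rat in the block shares.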
The expectation of the experimenter is another factor that can correlate results within a trial group. We usually require double-blind tests on humans, but not on rats.