I wrote large chunks of this essay having slept less than 1.5 hours over a period of 38 hours. I came up with and developed its biggest arguments while sleeping an average of 5 hours 39 minutes per day over the preceding 14 days. At this point, I’m pretty sure that the entire “not sleeping ‘enough’ makes you stupid” idea is a 100% psyop. It makes you somewhat more sleepy, yes. More stupid, no. I literally ran an experiment in which I tried to find changes in my cognitive ability after sleeping 4 hours a day for 12-14 days, and I couldn’t find any. The friends I was talking to a lot during the experiment simply didn’t notice anything.
I find it implausible that your anecdotes and a non-RCT N=1 self-experiment provide stronger evidence than several N≈20 non-pre-registered RCTs.
Yes, p-hacking and lack of pre-registration are bad, but IMO those things are pretty much negligible concerns when studies test cognition with several tests and find the same effect on almost all of them.
When I read the literature on the cognitive effects of sleep deprivation, it doesn’t sound like experimenters are giving subjects several tests, finding no effect on most of them, and then focusing on reporting the single p<0.05 result. Rather, they find medium-to-large effect sizes on most things they test, and effect sizes almost always have the same sign—very rarely favoring sleep deprivation. That could be because the experimenters are giving subjects 15 tests and only report the results of 5 of them, but it sounds unlikely that all sleep researchers involved in these studies are doing that. So it doesn’t look like p-hacking or lack of pre-registration are driving the huge effect sizes here.
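To put a rough number on that intuition (this is my own back-of-the-envelope sketch with a hypothetical battery size, not a figure from any study): under a null of no effect, each test’s effect-size sign is roughly a coin flip, so near-unanimous signs across a battery are unlikely to arise from chance alone.

```python
# Back-of-the-envelope sketch (hypothetical numbers): probability that
# at least 9 of 10 independent null-effect tests agree in sign.
from scipy.stats import binom

n_tests = 10   # hypothetical number of cognitive tests in a battery
p_sign = 0.5   # under the null, each effect-size sign is ~50/50

# P(>= 9 negative signs) + P(>= 9 positive signs)
p_agree = 2 * (binom.pmf(9, n_tests, p_sign) + binom.pmf(10, n_tests, p_sign))
print(p_agree)  # ~0.021
```

The obvious caveat is that cognitive tests are correlated with one another, which inflates chance sign agreement, so treat this as an illustration rather than a real calculation.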
It sounds a lot more plausible that an N=1 self-experiment has fatal flaws, especially if the one study subject wants the results to come out a certain way.
ETA: I no longer endorse my link to that specific meta-analysis. See my new post for a better investigation of the effects of sleep restriction on cognition.
I want to emphasize that I agree that we ought to prefer study data to N=1 motivated self-experiment, and I appreciate you bringing this meta-analysis to the table.
Now, let’s take a close look at Pilcher and Huffcutt.
Guzey’s self-administered study would be coded by P&H as “partial sleep deprivation” (less than 5 hours of sleep in a 24-hour period), the group showing the largest negative effect sizes in P&H. Note, however, that P&H are aggregating results from just 6 studies (citation numbers 39-41, 44, 48, and 49).
The experimental groups of these 6 studies included:
Anesthesia residents after 24 hours of in-house call.
Medical interns at work.
Periodic testing of graduate students at a psychological research center over 8 months, while the students lived an otherwise normal life.
Soldiers on an 8-day field artillery trial manually handling a large quantity of artillery shells (weighing 45 kg) and charges (13 kg).
Surgical residents who’ve been up working all night.
Internal medicine residents after a night on call.
In all but the trial on the graduate students, the subjects were dealing with both sleep deprivation and work-related fatigue. Guzey’s specifically interested in sleep deprivation per se, rather than the effects of work-related fatigue. Note that in the (n=8) study of the graduate students, “It was concluded that 6-8 months of gradual sleep restriction, down to 4.5-5.5 hrs per night, does not result in behavioral effects measurable by the instruments used… At the end of an additional 12-month follow-up period, total sleep time was still 1-2.5 hrs below baseline, but measures of well-being had returned to baseline levels.”
How could we get such clear and obvious results from the fatigue studies on doctors, and see nothing in the graduate students? I think work-related fatigue in the hospital is the culprit.
In addition, the inclusion of the soldier study is inexplicable. It wasn’t a study of sleep loss, but of the impact of sustained manual labor on muscular strength and endurance. The control and experimental groups had no statistically significant differences in sleep deprivation, though both were sleep-deprived. I can only interpret the meta-study as using the effect of sustained manual labor under (uniform) conditions of sleep deprivation as a proxy for the effects of sleep deprivation itself.
Here’s an outline of the study itself. Group A sleeps 4 hours per night and loads artillery shells; Group B sleeps 4 hours per night and doesn’t load any artillery shells. When Group A experiences a greater loss in muscular strength and endurance than Group B after 8 days of this regimen, the sleep researchers interpret this as the result of the sleep deprivation (which was identical to Group B’s), rather than the manual labor (the actual experimental intervention). So that makes no sense. Rather than inexplicable, perhaps I should have said inexcusable.
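To make the design flaw concrete, here’s a toy simulation (all numbers invented; this is my sketch, not the study’s data) of why the between-group contrast can only pick up the labor effect: the sleep restriction is identical in both arms, so it cancels out of the comparison.

```python
# Toy simulation of the soldier study's design (invented numbers):
# both groups get the same sleep restriction, so the group contrast
# estimates the manual-labor effect, not the sleep effect.
import numpy as np

rng = np.random.default_rng(0)
n = 20                # hypothetical soldiers per group
sleep_effect = -3.0   # strength change from 4 h/night sleep (both groups)
labor_effect = -8.0   # additional change from handling 45 kg shells

group_a = sleep_effect + labor_effect + rng.normal(0, 2.0, n)  # sleep + labor
group_b = sleep_effect + rng.normal(0, 2.0, n)                 # sleep only

# The shared sleep_effect cancels; the contrast recovers labor_effect.
print(group_a.mean() - group_b.mean())  # ~ -8, whatever sleep_effect is
```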
Considering the undoubtedly important confounding from the subjects’ intense work environments, and the fact that one of the six studies in this wing of the meta-analysis doesn’t belong there at all, I think this meta-analysis cannot shed as much light on Guzey’s research question as we might hope. In fact, the most useful part is the study of the graduate students, as it’s likely the least confounded by the effects of a high-stress and taxing work environment. And it’s this study that finds effects matching Guzey’s claim.
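To see how that plays out in the pooled estimate, here’s a toy fixed-effect (inverse-variance weighted) average with entirely invented effect sizes, none of which come from P&H: five confounded studies easily swamp one clean null.

```python
# Toy inverse-variance pooling (all numbers invented, not from P&H).
import numpy as np

d = np.array([-0.9, -0.8, -1.0, -0.7, -0.85, -0.05])  # last = grad students
se = np.array([0.30, 0.30, 0.30, 0.30, 0.30, 0.40])   # hypothetical SEs

w = 1.0 / se**2                     # fixed-effect weights
pooled = np.sum(w * d) / np.sum(w)
print(pooled)  # ~ -0.77: the single clean study barely moves the average
```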
All this is to highlight the importance of the specific research question and methods. Fatigue from working in a stressful or physically demanding environment is not the same thing as the effects of sleep deprivation. If we are interested in the latter, we should not permit the former to confound our study. This isn’t meant as an attack on you at all, but I think that the impulse to cite a meta-study of “sleep deprivation” that is actually a meta-study of work-related fatigue in high-stress environments, with one of its six studies erroneously included, is indicative of the problems Guzey’s highlighting in his overall interpretation and critique of sleep studies.
As a meta point, the fact that a journal in the field is willing to publish such a meta-study that draws misleading conclusions is evidence for Guzey’s claim that the field is cargo-culting and not trustworthy.
Your first (“huge”) and third (“sizes”) links are broken.
A measurement showing a correlation (or lack thereof) in the population does not exclude other correlations in subsets of the population.
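A minimal synthetic illustration of this point (Simpson’s-paradox style; the data are entirely made up): the pooled correlation is near zero while each subgroup shows a strong one.

```python
# Synthetic Simpson's-paradox-style example: strong within-subgroup
# correlations that cancel out in the pooled population.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x1 = rng.normal(-1, 1, n); y1 = x1 + 2 + rng.normal(0, 0.3, n)  # subgroup 1
x2 = rng.normal(+1, 1, n); y2 = x2 - 2 + rng.normal(0, 0.3, n)  # subgroup 2
x, y = np.concatenate([x1, x2]), np.concatenate([y1, y2])

print(np.corrcoef(x1, y1)[0, 1])  # ~0.96 within subgroup 1
print(np.corrcoef(x2, y2)[0, 1])  # ~0.96 within subgroup 2
print(np.corrcoef(x, y)[0, 1])    # ~0.0 in the pooled population
```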