Also, any recs on dev econ textbooks?
Yes, they’ve made it very clear that that’s the reasoning, and I am saying I disagree.
A) I still think they are not correct (long evidence below). B) Ct values are clearly somewhat useful, and the question is how much—and I do not think the public health comms apparatus should stifle somewhat-useful medical information reaching patients or doctors just because I might be misled. That’s just way too paternalistic.
As to why I think they’re wrong, I’ll cross-post from my fb thread against the specific pdf linked in op, though all other arguments seem isomorphic afaict. If you don’t trust my reasoning but want the reasoning of medical professionals, skip to the bottom.
Basically, the pdf just highlights a bunch of ways that Ct values aren’t perfectly precise and reliable. It says nothing about the relative size of the error bars and the signal, and whether the error bars can drown it out—and, they can’t. To use a very exaggerated metaphor, it’s like the people saying we need to pull J&J because it’s not “perfectly safe” without at all looking at the relative cost/benefit.
So, they give a laundry list of factors that will produce variability in Ct values for different measurements of the same sample. But toward the end of the doc, they proclaim that these sources of variability change the result by up to 2-3 logs, as if this is a damning argument against reporting them. The scale of Ct values is 10 logs. Hospitalized patients vary by 5 logs. That’s so much more signal than their claimed noise! So their one real critique falls very flat.
However, they do understate the noise significantly, so we can strengthen their argument. Within-patient variability is already like 2-3 logs, as you can see from data here for example: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7151491/. So variability of viral loads across different patients, different collection methods, and different analysis methods will have more like 4-6 logs of variation. That’s the stronger argument.
But even this is ultimately too weak.
Most of the variation is on the negative side: there are lots more ways to fail to get a good sample than there are to accidentally find the virus is more concentrated than in reality. So, low Ct values indicating high viral load are still very good signals! I don’t know the exact numbers here because many places won’t report them, but your reasoning would hypothetically go: if you get a Ct value under 20, you’d better start canceling meetings and preparing for a possible hospital visit. If you get a Ct value of 38, maybe it’ll end up getting much worse, or maybe not; not much information there. This is simple reasoning—doctors do it all the time with other tests with high falsity rates, saying “if you test positive on this you probably have X, but getting a negative doesn’t rule it out.”
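That simple reasoning can be checked with a toy simulation (all numbers below are my own placeholders, not from the pdf): even with 2-3 logs of measurement noise on a 10-log scale, an alarming reading almost always reflects a genuinely high true viral load.

```python
import random

random.seed(0)

# Toy model, with made-up numbers: true log10 viral loads spread uniformly
# over 10 logs; measurement noise is Normal(0, 1.5 logs), i.e. roughly the
# 2-3 log variability the pdf complains about.
N = 100_000
true_loads = [random.uniform(0, 10) for _ in range(N)]
measured = [t + random.gauss(0, 1.5) for t in true_loads]

# Among the most alarming 20% of measured loads, how often is the true
# load actually in the top half of the range?
pairs = sorted(zip(measured, true_loads), reverse=True)
top = pairs[: N // 5]
hit = sum(1 for _, t in top if t > 5) / len(top)
print(f"P(true load in top half | measured in top 20%) ~ {hit:.2f}")
```

The asymmetry shows up here too: the rare misses are almost all samples whose noise pushed them down, not up.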
And aside from this asymmetry, just the correlation is also really useful! I am not the first person to say this: googling turns up a bunch of instances of medical professionals saying similar things:
https://www.sciencemag.org/news/2020/09/one-number-could-help-reveal-how-infectious-covid-19-patient-should-test-results
https://www.aacc.org/science-and-research/covid-19-resources/statements-on-covid-19-testing/aacc-recommendation-for-reporting-sars-cov-2-cycle-threshold-ct-values
https://www.aacc.org/cln/cln-stat/2020/december/3/sars-cov-2-cycle-threshold-a-metric-that-matters-or-not
https://directorsblog.nih.gov/tag/ct-value/
Another sad regulation-induced (and likely public health comms-reinforced) inadequacy: we don’t report Ct values on PCR tests. Ct stands for cycle threshold: how many amplification cycles a PCR machine has to run before the virus is detected. So it directly (if inversely) measures viral load: fewer cycles means more starting virus. But it isn’t reported to us on tests for some reason: here’s an example document saying why it shouldn’t be reported to patients or used to help them forecast progression. Imo a very bad and unnecessary decision.
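For intuition on why Ct tracks viral load: each PCR cycle roughly doubles the target, so the Ct gap between two samples maps to a fold-difference in starting virus. A quick sketch (the perfect-doubling efficiency is an idealization; real assays run a bit under 2):

```python
def relative_viral_load(ct_a, ct_b, efficiency=2.0):
    """Fold-difference in starting viral load implied by two Ct values,
    assuming each cycle multiplies the target by `efficiency` (~2 for
    idealized PCR). Lower Ct means more starting virus."""
    return efficiency ** (ct_b - ct_a)

# A sample hitting threshold at Ct 20 vs one at Ct 38:
# 2^18, i.e. roughly a 260,000-fold difference in starting virus.
fold = relative_viral_load(20, 38)
print(f"{fold:,.0f}x more virus at Ct 20 than at Ct 38")
```

This is also why a few cycles of measurement noise translates to about one log of viral load (one log10 is ~3.3 doublings).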
Another cool data point! I found a paper from Singapore, Jul 2020, testing tear swabs but incidentally giving a bunch of PCR tests too. I’m much more likely to trust a paper that gives PCR tests incidentally, rather than one directly testing their effectiveness, where researcher bias pushes toward better results. This paper shows 24⁄108 PCR tests came back negative if I counted correctly: that’s a 22% false negative rate (FNR).
Now, for adjustments:
First, these patients were recruited from a hospital. So they obviously have much higher viral load than the average person, so we’d expect a higher FNR for the general population. (And we see the expected relationship between viral load and positive results: people with low average Ct values (meaning high viral load) rarely test negative, but those who test negative often have very high Ct values on their positive tests.)
On the other hand, only 2⁄17 patients test negative >50% of the time; a lot of the negatives come near the end of a patient’s sickness or hospital stay. So we don’t see great empirical evidence for the hypothesis that some people are consistent false-negatives. If you take out the negatives-at-the-end effect, there are far fewer false negatives, maybe 5-10%. However, this is basically moot because of the selection effect for the hospitalized as mentioned above. Of course you’ll see hardly any consistent-false-negative patients among the hospitalized! The fact that you see any macroscopic number of false negatives in the middle of progression is a terrible sign (and if there were any fully-false-negative patients, we wouldn’t see them anyway: hospitalization itself filters them out).
And we do see the requisite theoretical evidence. Because of the two patients with repeated false negatives and low viral load when positive, we can easily extrapolate that some patients just have slightly lower viral load and test negative consistently.
Overall, there isn’t an easy way to convert this study into “FNR on asymptomatic individuals who get tested.” However, I think if 5-10% of tests on the hospitalized came back negative, that strongly implies more than a 20% FNR on the asymptomatic. I would personally guess that this lends credence toward a 10-40% FNR on the symptomatic and 20-80% FNR on the asymptomatic. (Lest I double-count evidence, let it be known I’m basing these numbers in part on the above analysis of the symptomatic individuals I personally know, with ~40% FNR.)
I have a tentative answer! Some cursory googling makes me think that J&J also just replicates the spike protein in you, the same way Pfizer/Moderna do. This means it’s just strictly less effective. Then you’d want to just do the Pfizer/Moderna one that you haven’t yet—unless Elizabeth’s comment about limited mRNA vaccine doses is decision-relevant, which I still haven’t looked into.
In short, I don’t really know how it can be as bad as I claim it is. It seems like it should straightforwardly be highly accurate because of your two points: the sensitivity should be at a much lower threshold than the amount needed to infect someone.
Yet, I still believe this. Part of this belief is predicated on the heterogeneous results from studies, which make me think that “default” conditions lead to lots of false negatives and later studies showed much lower false negatives because they adjusted conditions to be more sanitary and less realistic. However, this is just an extrapolation, and I haven’t looked into these studies unfortunately.
The bigger reason for my belief is that I’ve seen several people almost-definitely get COVID and then test negative.
First data point: B was out in public unmasked, got it, then their family got it. 4ish people showed symptoms, one didn’t. 2-3 tested positive for COVID, the others didn’t, including 2 who tested negative 3ish times in a row, using PCR. B was one that tested negative repeatedly. B was notified shortly afterward that the person they were with in public had tested positive for COVID.
Second data point: C went back to university in Sep 2020. Two of their family members visited. Shortly after, C got pretty sick and tested positive for COVID. Then their family members got pretty sick with flu-like symptoms. Both family members went to the doctor after symptoms and tested negative by PCR.
Less-strong data point: D flew on a plane from the Bay late-Feb/early-Mar 2020. They landed, and two days later they got sick with a cough and maybe more, felt pretty bad, and had an SpO2 of 85. They tested negative 2-3 times by PCR.
I heard about these cases because they were fairly close to me. There were maybe 2 other cases as close to me as these, so these represent about half my epistemic exposure to COVID cases on the ground.
I don’t know how to possibly parse the first two cases aside from saying that the PCR tests gave false negatives. You can’t even say “they got the flu”—their family members tested positive for COVID! The best alternative explanations seem truly terrible: there’s a minuscule chance I just got a wildly skewed sample, or I could’ve done a truly abysmal job at noticing some selection effect. So I feel like I basically have to take them at face value. Using data points 1 and 2 only, and adding 2 positives from cases close to me, the PCR tests gave false negatives roughly 7⁄10 + 2⁄3 + 0⁄2 = 9⁄15 times, a ~60% false negative rate. But if you count by person this is only ~40% false negative, which is the better way to look at it, to correct for the selection effect of people who test negative getting tested repeatedly. Maybe someone got a rapid test when they thought they got PCR, so knock off 10% to bring the total to a 30% false negative rate by person. But not much more than that in sample mean.
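For transparency, here is the tally behind that arithmetic (the per-case counts are my reading of the anecdotes above and could be off by one or two):

```python
# Per-test tally of the anecdotes above: (negative tests, total tests).
cases = [
    (7, 10),  # data point 1: B and their family
    (2, 3),   # data point 2: C and their visiting family
    (0, 2),   # two additional close cases that tested positive
]
negatives = sum(n for n, _ in cases)
tests = sum(t for _, t in cases)
print(f"by-test FNR: {negatives}/{tests} = {negatives / tests:.0%}")
```

The by-person correction then deflates this, since repeat-negative people contribute several tests each to the numerator.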
As I said, I don’t really understand how PCR tests can be this bad. However, it would tie up very neatly if, for example, COVID just didn’t make it to the nose in a lot of patients. Perhaps orders of magnitude more is coughed out of the lungs as an infection vector than is exuded from the nostrils. Or perhaps the efficacy of swabbing varies a ton, or the efficacy of testing—a lot of the negative tests I know about were from Red states, and I can’t help but wonder if the old “getting the results you want to get” effect is striking in another wild circumstance (but of course note the selection bias since I know of more cases in Red states).
And even without knowing how PCR tests can be so bad, I don’t feel like I’m going that much out on a limb when I’m imagining there might be lots of heterogeneity in how they’re done. Even if the best tests are really quite good, if the worse tests have user-error rates of one in five, and these are selected to be the ones more in use where COVID outbreaks are (due to culture or the obvious causality), you could potentially have a lot of people with 20% FNR rates. Also, while I think lots of criticisms that “the lab is different than the real world” are misplaced, COVID tests seem like almost a central case where you’d expect that specific failure mode.
(I probably won’t delve into the papers to try to figure this out, but I would love to hear from anyone else who might have alternative hypotheses about this, or reasons why “COVID doesn’t always go to the nose” shouldn’t be the default hypothesis here.)
Oops, thought that was a top-level reply to me when I clicked on it, rather than a reply to Adam. Sorry. Makes more sense in context.
Great paper, thank you!
(Do you mean the Lancet / British intelligence test paper when you say ONS? I embarrassingly don’t see a paper I cited with those letters in it.)
The current way I imagine citing this is to use as a corroboration of my rough estimate of <2% 30yos having Long COVID. I don’t see an easy way to integrate it with IQ loss estimates—since I wouldn’t expect tiny levels of IQ loss to show up on a survey about actual Long COVID symptoms, it seems relatively consistent for 2% symptoms after 12 weeks to still correspond to an average IQ loss of .15 points after 12 weeks (~10% lose 1 IQ point, 1% lose several IQ points). I do think it points downwards somewhat, though, maybe a factor of 2?
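The consistency claim above is just an expected-value calculation; making it explicit (the 5-point figure for "several" is my stand-in):

```python
# Making the averaging explicit; "5 points" for the severe tail is my
# stand-in for "several IQ points".
p_mild, mild_loss = 0.10, 1.0      # ~10% lose 1 IQ point
p_severe, severe_loss = 0.01, 5.0  # ~1% lose several points
expected_loss = p_mild * mild_loss + p_severe * severe_loss
print(f"average IQ loss: {expected_loss:.2f} points")
```

So an average loss of ~0.15 points coexists fine with only ~2% reporting symptoms a survey would pick up.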
I should probably argue with Matt directly, but my brief take is that this is just entirely incompatible with what we see on the ground. The friends of mine who got COVID aren’t reporting 45% chance of their life being 20% worse. That’s… an incredibly massive effect that we would definitely see. Would anyone realistically bet on that?
The important thing about this hypothesis is that it multiplies the effect of all other modifications, like schools or heat, so it’s at least part of the answer whatever the proximal cause is (which I still think is possibly just this + behavioral changes, but do feel a little underwhelmed without some other factor).
Since almost all of these costs are from Long COVID, I think these are actually more like a constant 1/100th of remaining life than 1/100th chance of immediate death. However, since 5% of the cost is from death and another decent chunk is from possibility of CFS, I would understand if you made a small adjustment here. Personally, I don’t think I’m going to increase the estimate of my own risk, though part of that is because I think I was conservative enough that I’d be skewing things if I made even more implicit adjustments toward higher risk.
Yeah, sorry about the confusion about community policy vs individual. I originally tried to give community advice but it got too complicated too quickly, so I just shipped with the individual policy estimate. My ideal way for people to interpret this is something like the precursors for a dividing line between the attractor states of “just act like normal and stay away from your cautious friends” and “just keep being cautious and stay away from all the risky people”. E.g. I think there should be bubbles of people who try not to get COVID and collectively take on less risk than individually optimal, but other people like myself should just take individually-relevant levels of risk and live the next few months in contact with others like ourselves. Ideally I will have a less-simplified version of this written up soon, but that’s definitely questionable.
Actually, this is averaged over all 30yos who get COVID—I realize this was unclear as a summary, I’ll fix. So it’s equivalent to about .6% of them getting horrific CFS and losing half their life-equivalent. (Obv in reality you’re looking at a smooth distribution of badness).
I think this is reasonably supported by my own experience of seeing ~10 30yos getting COVID pre-vaccine and not having any CFS, and now we have 3x less risk with the vaccine.
Right. I think these are all fair, and I’ve tried to take them into account and be pretty conservative in not underestimating the risk. There are obviously a bunch of balancing forces on the opposite side from those you’ve laid out—eg the tipping point effect balances against the “you never notice” effect, otherwise you’re double-counting. In general, I’m trying to find a coherent picture that takes into account the negatives while not overcounting them—it seems very easy to overcount given that all good things correlate and all bad things correlate, but small effects don’t actually send one into a spiral.
To sanity-check my calculations, let’s consider what it would do to increase the amount of weighting to IQ by a factor of 3. I had estimated a .15-point loss in IQ is .5% of your life-equivalent lost, so tripling this would mean 1.5%. Then losing 1 IQ point would be equivalent to 10% of your life, and every fifth of a standard deviation would be a third of your life-equivalent. This would mean at your margins, exercising like two half-hour amounts in a week would cause your life to get 20% better. I think that stretches the bounds of credulity. Even a factor of 2 increase seems to stretch things, unless you’re very comfortable counting 70% of your life’s value as coming from that small edge of intelligence that differentiates eg a better-than-average grad student from an average grad student (maybe IQ 140 vs IQ 130).
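Spelling out that arithmetic (straight multiplication with the numbers as stated above):

```python
# Straight multiplication with the numbers as stated above.
base_cost = 0.005                 # 0.15 IQ points valued at 0.5% of life-equivalent
per_point = base_cost / 0.15      # cost of losing 1 full IQ point
tripled = 3 * per_point           # after tripling the IQ weighting
fifth_of_sd = 15 / 5              # a fifth of a standard deviation = 3 IQ points
print(f"1 IQ point, tripled weighting: {tripled:.0%} of life-equivalent")
print(f"a fifth of an SD, tripled: {tripled * fifth_of_sd:.0%}")
```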
Re “this”, I think I meant productivity but had changed the sentence. I’ll edit to fix.
I edited the section to include some more thoughts on the paper’s quality. In brief: I expect a 2x diminishment over time (though this was conservative and it could easily be larger); I expect the selection bias is definitely real, though there’s a countervailing effect from most of the COVID cases being self-reports, which I expect means higher conscientiousness and many missed cases from the group otherwise being selected for; I also think the ventilator impairment matches well with other evidence we have about ventilator impairment and is probably not a large overestimate, though this may not shed much light on the reliability of the smaller disease-burden impairment estimate; and I hadn’t noticed the age difference, thanks a lot for pointing that out!
So, to be clear, his all-things-considered view is that about 1.5k uCOVIDs cost an hour, which is toward the edge of my range that corresponds to high risk, while his numerical estimate is at 7.5k uCOVIDs costing an hour, which is just outside of my range that corresponds to low risk (but matches almost exactly with my estimates of Long COVID risk outside of follow-ons from cognitive impairment!). So it sounds like initially he disagreed toward higher risk, then found numbers very similar to mine, and now only leans toward the high end of my risk estimate due to priors and right-tail uncertainty. (Apologies if my paraphrase does not do your view justice, Adam.)
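For scale, here is what those two exchange rates imply in hours of life-equivalent per full expected COVID case (my arithmetic, taking a uCOVID as a one-in-a-million chance of catching COVID):

```python
# A uCOVID is a one-in-a-million chance of catching COVID; this is just
# my arithmetic on the two exchange rates mentioned above.
def hours_per_covid(ucovids_per_hour):
    """Hours of life-equivalent that one full expected COVID case costs,
    given a rate of uCOVIDs traded per hour."""
    return 1_000_000 / ucovids_per_hour

print(f"{hours_per_covid(1_500):.0f} hours/COVID at 1.5k uCOVIDs per hour")
print(f"{hours_per_covid(7_500):.0f} hours/COVID at 7.5k uCOVIDs per hour")
```

So the two views differ by exactly the 5x ratio of the rates: roughly 670 vs 130 hours per expected case.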
Hm, I meant to exclude those because of their abysmal sensitivities, but I suppose I should revisit them now in case they’ve gotten better.
No, but I would like to know. The two relevant variables are that mRNA is more effective, which we can sort of quantify, but non-mRNA is probably more different, which I don’t know how to quantify. Currently I view them as roughly equivalent, but an even cursory glance at what was in the mRNA vaccines vs non would potentially be quite helpful.
That’s a really interesting point. I wish we had a conventional way of suspending certain regulations in certain circumstances, rather than having to wait decades for an entirely overhauled piece of legislation on the whole macrotopic. Are there things like, I don’t know, executive non-enforcement orders that ever get used similarly?
Do you know of any non-pooled tests that are cheap and fast, that perhaps a group of individuals could order loads of? I’ve heard people talk about LAMP and such for a while but without any persuasive end-to-end evidence.