Fast evidence assessment

Recently I’ve been thinking a lot about how to assess evidence quickly. We usually don’t have time to do a deep dive into everything we hear, but there are quick intuitions that can be applied to get an idea of how much a piece of evidence should change our minds. I’ve written about this previously in the context of scientific papers, but I think the rules are more generalizable than that.

So this is an attempt to get as much as possible out of a piece of evidence with as little effort, and as little knowledge of statistics, as possible.

Below is one intuition I find useful. I have a few others I could write up, and I’ll use the reaction to this one to judge whether they’re worth writing.

Large Effects Can’t Prove Small Causes

(Note: When I say “small cause” I mean a cause which would be expected to create a small effect.)

Say you have some evidence of a very large effect and a proposed cause. The first thing to ask yourself is not whether you believe the cause could create some effect, but whether the proposed cause could possibly cause an effect that big.

Does it seem to contradict anything else which you think you know about the world?

If it were true then what other evidence would you expect to see?

If there is no corresponding evidence then the original evidence was probably caused by something else.

You don’t even need to know what the other cause might be (although it helps if you can). So long as you can confidently assess that the proposed cause isn’t anywhere near sufficient to cause such a large effect, the evidence provides negligible confirmation that the proposed cause is creating any of the effect. Any effect that the proposed cause is having is being drowned out by noise created from other causes.
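
For anyone who prefers to see this as numbers, here is a minimal sketch of the rule as a likelihood ratio. It is my own toy model rather than anything rigorous: it assumes the "other causes" behave like normally distributed noise wide enough to produce the observed effect (we know something did), and every number in it is hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def likelihood_ratio(observed, cause_effect, noise_sd):
    """P(observed | cause is real) / P(observed | cause is absent).

    Toy assumption: the observation is the proposed cause's contribution
    plus normally distributed noise from all other causes.
    """
    with_cause = normal_pdf(observed, cause_effect, noise_sd)
    without_cause = normal_pdf(observed, 0.0, noise_sd)
    return with_cause / without_cause


def posterior(prior, ratio):
    """Update a prior probability using a likelihood ratio."""
    odds = prior / (1 - prior) * ratio
    return odds / (1 + odds)


# Hypothetical units: we observe an effect of 21, but the proposed cause
# could plausibly contribute about 1. Other causes must supply the rest.
ratio = likelihood_ratio(observed=21.0, cause_effect=1.0, noise_sd=20.0)
updated = posterior(prior=0.5, ratio=ratio)
print(f"likelihood ratio: {ratio:.3f}")
print(f"belief moves from 0.500 to {updated:.3f}")
```

Under these assumptions the ratio comes out very close to 1, so the observation barely moves the prior in either direction. That is the whole rule: the evidence is almost equally likely whether or not the small cause is real.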

I don’t think there’s anything massively new here and I think people have a sense that this is true (Putanumonit has also discussed something similar in the context of scientific papers). However, I think it’s easy to forget to apply it, especially to informal evidence.

I’ve had cause to use this rule at least 3 times in the last few months on COVID-related data, and hopefully these examples illustrate the point.

COVID cell phones

A friend mentioned to me that there had been a 21 million drop in the number of Chinese cell phone contracts between November 2019 and February 2020. They presented this as evidence that the Chinese government was covering up many deaths due to COVID.

If there had been 21 million deaths, I don’t think the government would be able to cover this up and claim there were only 3,000. China is a big country, but this would still be 1.5% of the population. The government may have a tight hold on communication, but I can’t believe it could hide an effect anything like this size; there would surely be other evidence of such a huge conspiracy coming out.
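
The sanity check here is a couple of divisions. The sketch below assumes a population of roughly 1.4 billion for China, which is my round figure rather than something from the original claim; the other two numbers are the ones quoted above.

```python
CONTRACT_DROP = 21_000_000        # drop in cell phone contracts (from the claim)
OFFICIAL_DEATHS = 3_000           # official death toll at the time (from the claim)
CHINA_POPULATION = 1_400_000_000  # assumption: rough round figure

# What fraction of the country would have had to die?
share_of_population = CONTRACT_DROP / CHINA_POPULATION

# How many times larger than the official toll would the real one be?
cover_up_factor = CONTRACT_DROP / OFFICIAL_DEATHS

print(f"share of population: {share_of_population:.1%}")
print(f"implied deaths per official death: {cover_up_factor:,.0f}")
```

A cover-up hiding thousands of deaths for every one admitted is the kind of thing you would expect to leave a lot of other evidence behind.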

It seems like the people who were talking about the drop in contracts acknowledged this – one was quoted as saying: “At present, we don’t know the details of the data. If only 10 percent of the cellphone accounts were closed because the users died because of the… virus, the death toll would be 2 million.” My friend said something similar.

This seems at first glance like a reasonable concession but it doesn’t work. It takes a very strange kind of confidence to claim that other effects could cause 19M fewer contracts but they couldn’t possibly have caused all 21M fewer contracts.
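
To make the problem with the concession concrete, here is a small sketch that splits the drop according to the quoted 10 percent. The function name and structure are just mine for illustration.

```python
def split_effect(total, percent_from_proposed_cause):
    """Split an observed effect into the part attributed to the proposed
    cause and the part that must come from everything else."""
    from_cause = total * percent_from_proposed_cause // 100
    from_other = total - from_cause
    return from_cause, from_other


deaths, other_causes = split_effect(21_000_000, 10)
print(f"attributed to deaths:       {deaths:,}")
print(f"attributed to other causes: {other_causes:,}")
print(f"other causes explain {other_causes / 21_000_000:.0%} of the drop")
```

Once you grant that other causes can account for nine tenths of the drop, you need a specific reason to think they stop there, and the contract data alone doesn't supply one.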

So my conclusion was that the cell phone contract cancellations do not provide any evidence for the Chinese government having hidden some coronavirus deaths (obviously they don’t disprove it either!).

It turned out that these cancellations were probably mainly due to people cancelling second phone contracts – see e.g. Snopes.

COVID reinfections

(See here for original comment)

A while ago there was concern that some recovered COVID patients from South Korea had been reinfected.

The numbers were worrying – 163 patients having tested positive after having recovered.

But again the numbers were too high. If these were reinfections, they would have implied that recovered patients were getting infected at a rate ~2,000 times as high as the general population, which seems far too high.

We might say that the real effect of reinfections is smaller and that there are also other causes but, again, as soon as you admit that a large portion of the effect might come from other sources it destroys almost all of the evidence for the original source.

So the effect seems too large to be caused by reinfections and there’s probably another explanation for the evidence. In this case it turned out to be false positives due to inactive virus being detected by the RT-PCR test.

COVID prevalence in Tokyo

(See here for original comment)

More recently a study came out showing that a particular company in Tokyo (spread across 11 sites) had a seropositivity rate (i.e. the share of people with antibodies indicating a previous infection) of 46.8%. Although it was acknowledged that this was only one company, it was suggested that, as the testing covered multiple sites, this might generalise somewhat to the wider Tokyo population.

Again the numbers didn’t fit together. A prevalence of 46.8% implies ~6,500,000 cases in the Tokyo prefecture. The official number of infections is ~25,000. With only 6% of tests returning positive, it is unlikely that so many cases were being missed.
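
The gap is easy to reproduce. This sketch assumes a population of about 14 million for the Tokyo prefecture, which is my rough figure; the seropositivity rate and the official case count are the ones given above.

```python
SEROPOSITIVITY = 0.468         # rate reported by the study
OFFICIAL_CASES = 25_000        # approximate official infections in Tokyo
TOKYO_POPULATION = 14_000_000  # assumption: rough figure for the prefecture

# Cases implied if the company's rate held across the whole prefecture.
implied_cases = round(SEROPOSITIVITY * TOKYO_POPULATION)

# How many infections would have been missed for each one recorded?
undercount_factor = implied_cases / OFFICIAL_CASES

print(f"implied cases: {implied_cases:,}")
print(f"implied cases per official case: {undercount_factor:.0f}")
```

So the study being representative would require official figures to be missing a couple of hundred infections for every one they caught.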

If the 6.5M is correct it implies everything I thought about COVID IFR, test accuracies, asymptomatic rates and/or reliability of Japanese government statistics is wrong. And not just a little wrong – massively and completely wrong. It’s possible but it’s not likely.

I’m not sure why the study might be giving such different numbers but I Defy The Data!

I’ve always wanted to say that.

Conclusion

In all of these cases we’ve achieved a lot with little effort. I could try to investigate in more detail but time is valuable.

It’s possible that my estimates of what counts as a reasonable effect size are wrong, and if I have to be super confident in the result then it’s worth putting in some extra work.

This rule doesn’t prove the proposed cause has no effect. It just means that any effect from this cause is being drowned out by other causes, so almost no useful conclusions about the proposed cause can be drawn from the evidence in question.

Whilst my examples here are all COVID-related, I also regularly use this rule for other things:

Assessing evidence from tests at my work:
- Could this small manufacturing defect cause such early hour failures?

Assessing other factoids:
- Does having a desk in their room increase a student’s chances of wanting to go to university?
- Does a mathematician’s field of study correlate with corn eating habits? (original comment, SSC result)

There is a danger here that this might cause one to anchor too firmly on one’s current beliefs, but I think this is avoidable.

You can defy the data on one experiment. You can’t defy the data on multiple experiments. At that point you either have to relinquish the theory or dismiss the data—point to a design flaw, or refer to an even larger body of experiments that failed to replicate the result, or accuse the researchers of a deliberate hoax, et cetera. But you should not turn around and argue that the theory and the experiment are actually compatible. Why didn’t you think of that before you defied the data? Defying the data admits that the data is not compatible with your theory; it sticks your neck way out, so your head can be easily chopped off.

(from I Defy The Data!)