Making Sense of Coronavirus Stats
[Update: Some have pointed out that the mortality rate should be defined as the ratio of deaths to some defined population, typically a median estimate over a period of time (week, month, year...), and not limited to those infected. (New update here. The ratios I have been considering are called the Case Fatality Rate.)
It was also correctly pointed out that my second calculation was simply wrong. The ratio should not be deaths/recovered but deaths/(deaths+recovered), that is, deaths relative to the total population considered. ]
From what I’ve seen, WHO and other health organizations are saying the mortality rate for the new coronavirus outbreak is between 2 and 3 percent. That seems to be based on the ratio of reported deaths to reported cases of infection.
That doesn’t seem right to me.
The latest statistics I looked at (from a news report) were:
Total: 75,768
Recovered: 16,329 (21.6%)
Deaths: 2,129 (2.8%)
However, that leaves us with a bit over 75% of cases with an unknown end state.
If I try to infer the outcome for all the reported infections, I can think of two ways to estimate the end results. One is to assume the remaining cases will produce outcomes similar to those observed so far. Using that assumption, I can iterate through the unknown cases: in each round 21.6% will recover, 2.8% will die, and ~75% will move to the next round.
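For anyone who wants to check the iteration, here is a rough Python sketch of it. The function name, the stopping tolerance, and the idea of treating each pass as a "round" are my own assumptions for illustration; the only real inputs are the three reported figures above.

```python
# Reported snapshot figures quoted in the post (from a news report).
TOTAL = 75_768
RECOVERED = 16_329
DEATHS = 2_129


def project_by_rounds(total, recovered, deaths, tol=0.5):
    """Repeatedly apply the observed outcome shares to the unresolved cases.

    Assumption (not a medical model): every round resolves cases in the same
    proportions as the snapshot, roughly 21.6% recover, 2.8% die, and the
    rest carry over to the next round. `tol` is a hypothetical cutoff: stop
    once fewer than `tol` cases remain unresolved.
    """
    p_recover = recovered / total
    p_die = deaths / total
    unresolved = float(total)
    total_deaths = 0.0
    total_recovered = 0.0
    rounds = 0
    while unresolved > tol:
        total_deaths += unresolved * p_die
        total_recovered += unresolved * p_recover
        unresolved *= 1.0 - p_recover - p_die  # share that moves to the next round
        rounds += 1
    return total_deaths, total_recovered, rounds


projected_deaths, projected_recovered, rounds = project_by_rounds(TOTAL, RECOVERED, DEATHS)
print(f"projected deaths:    {projected_deaths:,.0f}")
print(f"projected recovered: {projected_recovered:,.0f}")
print(f"rounds needed:       {rounds}")
```

The first round simply reproduces the snapshot, and each later round adds a shrinking slice, so the totals settle quickly. Because the carry-over share is constant, the sum is a geometric series whose limit is total × deaths/(deaths+recovered).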
The other way would be to assume the ratio of the current deaths to current recoveries is a good measure of the mortality rate.
Using the first approach, total deaths approach 8,740 people from the current population of 75,768. The second approach results in a higher number, 9,879. Both of these numbers would suggest this version of the coronavirus is pretty bad in terms of mortality rates. In the context of the other two coronavirus outbreaks, it seems closer to SARS, though a bit worse, than to MERS.
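As a quick arithmetic check, the sketch below puts the candidate ratios side by side using the reported figures. The variable names are mine, and the third ratio is the corrected one from the update note at the top, deaths/(deaths+recovered); this is only a back-of-the-envelope comparison, not an epidemiological estimate.

```python
# Reported snapshot figures quoted in the post.
TOTAL = 75_768
RECOVERED = 16_329
DEATHS = 2_129

# Standard approach: deaths over all reported cases, resolved or not.
naive_cfr = DEATHS / TOTAL
# The post's original second approach: deaths over recoveries.
deaths_per_recovery = DEATHS / RECOVERED
# Corrected ratio from the update note: deaths over resolved cases.
resolved_cfr = DEATHS / (DEATHS + RECOVERED)

estimates = [
    ("deaths / total", naive_cfr),
    ("deaths / recovered", deaths_per_recovery),
    ("deaths / (deaths+recovered)", resolved_cfr),
]
for label, rate in estimates:
    projected = rate * TOTAL  # implied deaths if the rate held for all reported cases
    print(f"{label:<28} {rate:6.1%}  {projected:>8,.0f} projected deaths")
```

Note that the corrected ratio gives essentially the same projection as the round-by-round approach, since iterating with fixed shares converges to deaths/(deaths+recovered) of the total.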
Should I think this approach to estimating mortality rates for new diseases without a long history (unlike the flu) might be more accurate than the standard approach, which seems to be deaths/total infections? The standard approach would seem to systematically underestimate mortality, since initially one would expect infections to be rising more rapidly than deaths.
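To make the lag argument concrete, here is a toy model in Python. Every number in it is made up for illustration (a 10% true fatality rate, 15% daily case growth for 40 days, a fixed 14-day delay from report to death) and nothing is fitted to the actual outbreak; the function name and parameters are hypothetical.

```python
def naive_cfr_series(true_cfr=0.10, growth=1.15, growth_days=40,
                     lag_days=14, total_days=70, seed_cases=100.0):
    """Daily values of cumulative deaths / cumulative cases in a toy outbreak.

    Assumptions (all hypothetical): new cases grow geometrically for
    `growth_days` and then stop; a fixed share `true_cfr` of each day's
    cases dies exactly `lag_days` after being reported.
    """
    new_cases = [seed_cases * growth**d if d < growth_days else 0.0
                 for d in range(total_days)]
    cum_cases = 0.0
    cum_deaths = 0.0
    series = []
    for day in range(total_days):
        cum_cases += new_cases[day]
        if day >= lag_days:
            # Deaths today come from the cases reported lag_days ago.
            cum_deaths += true_cfr * new_cases[day - lag_days]
        series.append(cum_deaths / cum_cases)
    return series


series = naive_cfr_series()
for day in (10, 30, 45, 69):
    print(f"day {day:2d}: deaths/total = {series[day]:.1%}")
```

In this toy setup the deaths/total ratio stays far below the true rate while cases are still growing, and only catches up once new cases stop and the delayed deaths have been counted.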
The implication is that health organizations/bureaucracies might tend to be slow to react. The early numbers look like nothing to worry about, and might even show a decreasing mortality rate initially. Then, as the infected either succumb to the infection or recover, mortality rates start rising, producing greater concern and calls for action.
That seems to reflect the general response both from China and from WHO in many ways.
So as I’ve been writing this, I’ve come to wonder if the way mortality rates are calculated might lead to poor bureaucratic responses. I wonder whether using either of the above, rather than the standard measure, might be better. At the end of the day, both measures will converge to the same number over time, as the daily, weekly, monthly or even annual data points become too small to really move the dial at all.