The novel coronavirus spreading around the globe is now a pandemic in all but name. Iran is estimated (based on cases exported to other countries) to have 18,300 infected. South Korea now has more than 1,700 cases, and Italy has found 655 infected. Around the world, excluding China and the cruise ship Diamond Princess, a large and growing percentage of these detected cases are from local community transmission and, disturbingly, from unknown transmission chains.

Graph via @GHoeberX, an excellent source of coronavirus analysis on Twitter.





According to the CDC, the spread of the virus in the US appears inevitable: “It’s not a question of if this will happen, but when this will happen and how many people in this country will have severe illnesses.”

And that is the remaining question at this point: how bad will it be when it hits? It’s clear that the virus cannot be contained, but how deadly will it be, and what percentage of the population will be infected? There is no clear consensus answer yet, but there is reason to be concerned based on the data we do have.





It was hoped initially that the case fatality rate (CFR) of the virus could be quite low, because a significant number of mild or asymptomatic cases were being missed, which would drive down the fatality rate. However, the WHO has said this no longer seems likely:





One of the hopes of people watching China’s coronavirus outbreak was that the alarming picture of its lethality is probably exaggerated because a lot of mild cases are likely being missed. But on Tuesday, a World Health Organization expert suggested that does not appear to be the case…

“So I know everybody’s been out there saying, ‘Whoa, this thing is spreading everywhere and we just can’t see it, tip of the iceberg.’ But the data that we do have don’t support that,” Aylward said during a briefing for journalists at WHO’s Geneva headquarters.

So, if we aren’t missing a bunch of mild cases, what is the Case Fatality Rate (CFR) of the virus?





Case Fatality Rate

In “Methods for Estimating the Case Fatality Ratio for a Novel, Emerging Infectious Disease,” several methodologies for calculating CFR are compared. Fortuitously, they are compared by how they performed during the 2003 SARS epidemic, which, given the new virus’s close genetic relationship to SARS, makes the calculations especially relevant.





Among the various methodologies, two provided reasonable values across the course of the epidemic. We will use the formula e2(s) = D(s) / {D(s) + R(s)}, where D(s) and R(s) denote the cumulative numbers of deaths and recoveries. This methodology has the added benefit that we do not need to know the date when symptoms began for the fatal cases, information which is not readily available for all cases at this time.





Furthermore, for a more accurate number, the dataset is restricted to deaths and recoveries from countries that scored at least 50 out of 100 on the 2019 Global Health Security Index’s measure of their ability to detect and report emerging epidemics. Note that this excludes data from mainland China and the deaths reported in Iran, which would otherwise skew the CFR significantly higher.
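
As a rough illustration of that restriction, the filtering step might look like the sketch below. The record layout, the `pooled_counts` helper, and every number in `records` are invented placeholders for illustration only; they are not actual Global Health Security Index scores or real case counts.

```python
from dataclasses import dataclass


@dataclass
class CountryRecord:
    name: str
    detection_score: float  # hypothetical 0-100 detection/reporting score
    deaths: int             # cumulative deaths reported
    recoveries: int         # cumulative recoveries reported


def pooled_counts(records, min_score=50.0):
    """Sum deaths and recoveries over countries scoring at least min_score."""
    kept = [r for r in records if r.detection_score >= min_score]
    return sum(r.deaths for r in kept), sum(r.recoveries for r in kept)


# Invented placeholder rows, NOT real data.
records = [
    CountryRecord("Country A", 72.0, 3, 20),
    CountryRecord("Country B", 41.5, 10, 5),   # below the cutoff, dropped
    CountryRecord("Country C", 50.0, 1, 8),    # exactly at the cutoff, kept
]

deaths, recoveries = pooled_counts(records)
print(deaths, recoveries)
```

The cutoff is inclusive here to match "at least 50 out of 100"; the pooled totals are what would then feed the CFR formula.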





Based on this methodology, with 39 deaths and 265 recoveries, we get a CFR of 12.83% (95% confidence interval: 12.20% to 13.65%). NOTE: this CFR estimate, and other important factors for the virus, are updated daily here on my COVID-19 tracker. This estimate is closer to what was seen in the Chinese city of Wuhan than to the mild virus the world hoped for.
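
For readers who want to check the arithmetic, here is a minimal sketch of the point estimate, deaths divided by deaths plus recoveries. The function name `cfr_e2` is my own label for illustration, and the sketch does not attempt to reproduce the confidence interval.

```python
def cfr_e2(deaths: int, recoveries: int) -> float:
    """Estimate CFR as deaths / (deaths + recoveries), using resolved cases only."""
    resolved = deaths + recoveries
    if resolved <= 0:
        raise ValueError("need at least one resolved case (death or recovery)")
    return deaths / resolved


# Counts quoted in the text: 39 deaths and 265 recoveries.
estimate = cfr_e2(deaths=39, recoveries=265)
print(f"{estimate:.2%}")  # 12.83%
```

Because the denominator counts only resolved cases, the estimate does not depend on how many patients are still ill, but it does shift if deaths and recoveries are reported at different speeds.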





This estimate has risen considerably as more data has come in from outside mainland China.

Optimistically, perhaps as the virus spreads to a new country, the first thing that gets noticed is deaths, so recoveries are missed and the number is temporarily skewed higher. Pessimistically, I note that the early estimates for SARS underestimated the CFR and rose to the true, higher value as more data came in. See the graph below from “Methods for Estimating the Case Fatality Ratio for a Novel, Emerging Infectious Disease,” which examined how different methodologies for calculating CFR performed during the SARS outbreak.

It is still early in the outbreak, and I hope that this estimate is skewed high. But it’s clear that, with the data we currently have, there is good reason to be concerned about what the ultimate outcome of the pandemic will be.





Changes to the estimated CFR and the R0 values are updated at least daily here.





A speculative SIRD model projecting the spread of the virus in the US will be updated fairly regularly on this page as data comes in.





On Twitter: @joshuafkon