Julien Riou writes:

Stan epidemiologist here. We actually just released a preprint [estimating death rates of people infected with coronavirus, breaking down the population by age and then poststratifying] using Stan (https://www.medrxiv.org/content/10.1101/2020.03.04.20031104v1). Crude estimates of case fatality ratio obtained by dividing observed deaths by observed cases are biased in two ways: 1) Deaths are underestimated because of the delay between disease onset and death (right censoring); 2) Total cases are underestimated because surveillance efforts focus on severe cases and miss asymptomatic and mild cases. We attempted to correct for both these biases using data from China and a few assumptions. It might still need some refinement though, happy to hear any comment.
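To see why the first bias matters, here is a minimal sketch in Python. It is not the method used in the preprint, which fits a full transmission model; it is the simpler idea of dividing deaths by the number of cases whose fatal outcome would already have been observed. The exponential delay with a 20-day mean and the case and death counts are made-up assumptions for illustration only.

```python
import math

def delay_cdf(days, mean_delay=20.0):
    """Probability that an eventual death has already occurred `days` after onset.

    Assumes an exponential onset-to-death delay (an illustrative choice).
    """
    return 1.0 - math.exp(-max(days, 0) / mean_delay)

def crude_cfr(cases_by_onset_day, deaths_so_far):
    """Observed deaths divided by observed cases, ignoring the delay to death."""
    return deaths_so_far / sum(cases_by_onset_day)

def delay_adjusted_cfr(cases_by_onset_day, deaths_so_far, mean_delay=20.0):
    """Observed deaths divided by the effective number of cases with a visible outcome."""
    last_day = len(cases_by_onset_day) - 1
    # Recent cases count for less, because their deaths have mostly not happened yet.
    at_risk = sum(
        n * delay_cdf(last_day - day, mean_delay)
        for day, n in enumerate(cases_by_onset_day)
    )
    return deaths_so_far / at_risk

# Hypothetical data: 50 new cases per day for 30 days, 20 deaths so far.
cases = [50] * 30
deaths = 20
print(f"crude: {crude_cfr(cases, deaths):.2%}")
print(f"delay-adjusted: {delay_adjusted_cfr(cases, deaths):.2%}")
```

The adjustment pushes the estimate up because many recent cases have not had time to die. The second bias, missed mild and asymptomatic cases, pushes in the other direction and is not handled by this sketch.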

And here’s the paper by Julien Riou, Anthony Hauser, Michel Counotte, and Christian Althaus:

We [Riou et al.] estimated the age-specific case fatality ratio (CFR) by fitting a transmission model to data from China, accounting for underreporting of cases and the time delay to death. . . . We find that 1.6% (1.4-1.8) of individuals infected with COVID-19 [in Hubei between 1 Jan and 11 Feb] with or without symptoms died or will die, with even more important differences by age group than suggested by the raw data. The probability of death among infected individuals with symptoms is estimated at 3.3% (2.9-3.8), with a steep increase over 60 years old to reach 36% over 80 years old.

The narrowness of these intervals implies that these are inferential uncertainties conditional on the model and do not account for uncertainty in the model itself.

Here are some more graphs from the paper:

Strengths of the analysis

Here’s how Riou et al. describe the strengths of their work:

(1) We use a mechanistic model for the transmission of and the mortality associated with COVID-19 that is a direct translation of the data-generating mechanisms leading to the biased observations of the number of deaths (because of right-censoring) and of cases (because of surveillance bias). Our model also accounts for the effect of control measures on disease transmission. (2) Our model is stratified by age group, which has been shown as a crucial feature for modelling emerging respiratory infections [16]. (3) The estimates rely on routinely collected surveillance data such as incident cases by disease onset, incidence deaths, and the age distribution of cases and deaths, and does not require individual-level data nor studies in the general population.

Limitations

The paper continues with a list of limitations:

(1) Our results depend on the central assumption that the cause of the deficit of reported cases among younger age groups is a surveillance bias and does not reflect a lower risk of infection in younger individuals. The reason for this age shift is unknown [10]. Retrospective testing for COVID-19 of samples from influenza-like-illness surveillance found no positive test among children, but the sample sizes were small (20 per week including both adults and children) [10]. Uneven age distributions in the risk of infection can be attributed to immunological features, such as the lower circulation of H1N1 influenza in older individuals due to residual immunity [17]. An immunological explanation of the opposite phenomenon, with a lower susceptibility of younger individuals, seems unlikely, and there is no indication of pre-existing immunity to COVID-19 in humans [10]. Different contact patterns could play a role in a limited outbreak, but not in such a widespread infection, especially as household transmission seems to play a major role [10]. The last explanation that we assume here is that younger individuals, when symptomatic, have milder symptoms that decrease the probability of seeking care and being identified.

(2) In a related matter, our results depend on the assumption that older individuals have more severe symptoms and are more likely to be identified. In the absence of an outside reference point, the reporting rate cannot be estimated from surveillance data only. We chose to fix to 100% the reporting rate of infected individuals that have symptoms and are aged 80 and more, and estimate the reporting rates in other age groups relatively to that of older individuals. If further data, coming from a study in the general population, shows that this assumption is violated, this would lead to an overestimation of the CFR in our study.

(3) There is important uncertainty around the proportion of asymptomatic infections. Currently, the detection of asymptomatic patients in China is limited by the focus on symptomatic patients seeking care and the lack of seroprevalence data [18]. The proportion of symptomatic infections has been estimated to 58% (95% confidence interval: 33-83) in a small sample of cases exported to Japan [19]. During the outbreak on the ship “Diamond Princess”, nearly all individuals were tested regardless of symptoms, leading to an average proportion of symptomatic infections of 49% in a sample size of 619, which was used in the present study [13]. Still, uncertainty about the proportion of symptomatic infections will remain until a large retrospective seroprevalence study is conducted in the general population, and our results are dependent on this estimate. Additionally, the dichotomization of infection into asymptomatic and symptomatic is a simplification of reality; the infection with SARS-CoV-2, will likely cause a gradient of symptoms in different individuals depending on age, sex and comorbidities [10]. The proportion of asymptomatic infections might show an age-dependent structure.

(4) Our findings regarding the CFR are specific to the context, and should be interpreted in that light. The findings describe the situation in Hubei from 1 January to 11 February, 2020. It was demonstrated there, that mortality rates have changed over time as a result of an improvement of the standard of care [10]. The standard of care and, as a result, the CFR is setting-dependent and cannot be directly applied to other contexts.
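The logic of the first two limitations can be made concrete with a toy calculation. This is only a sketch of the reasoning, not the authors' model: it assumes that everyone is equally likely to be infected regardless of age and that reporting among the oldest group is complete, so any shortfall in reported cases per capita in a younger group is read as underreporting. The age groups and all the numbers below are hypothetical.

```python
def relative_reporting_rates(reported_cases, population, reference_group):
    """Reporting rate of each group relative to a reference group fixed at 1.0.

    Assumes the same per-capita risk of infection in every group, so that
    differences in reported cases per capita reflect reporting alone.
    """
    per_capita = {g: reported_cases[g] / population[g] for g in reported_cases}
    reference = per_capita[reference_group]
    return {g: per_capita[g] / reference for g in per_capita}

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
reported = {"0-39": 200, "40-79": 900, "80+": 100}
population = {"0-39": 5000, "40-79": 4500, "80+": 250}

rates = relative_reporting_rates(reported, population, reference_group="80+")
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} of the reporting rate in the 80+ group")
```

If younger people really are less likely to be infected, these implied reporting rates are too low, the number of infections is overstated, and the fatality estimate among the young is understated. That is why the paper flags the assumption so prominently.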

I have not read this paper carefully or tried to evaluate their model or their claims.

P.S. Riou adds:

All data and code is available on https://github.com/jriou/covid_adjusted_cfr, the stan code is in model/model10.stan and the data is obtained and formatted in run_models.R. The model is computationally intensive, about 1 day on our cluster here in Switzerland. The idea of the model is some kind of post-stratification, backed by an epidemic SIR-type model implemented in ODEs. I made some improvement on model10, including another source for the proportion of symptomatics (and using a distribution not a fixed proportion), and adding data about the contacts between age groups in China. It’s still running, I will upload it to github now (model11 and model12).
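For readers who have not seen an SIR-type model written as ODEs, here is a bare-bones sketch in Python. It is far simpler than the Stan model in the repository (no age groups, no control measures, no observation model, no inference), and the parameter values and the fixed-step Euler integration are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I.

    Uses a simple forward Euler scheme and returns one (S, I, R) tuple per day,
    starting with the initial state.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory

# Hypothetical parameters: transmission rate 0.5/day, recovery rate 0.2/day.
path = simulate_sir(beta=0.5, gamma=0.2, s0=9990, i0=10, r0=0, days=60)
peak_day = max(range(len(path)), key=lambda d: path[d][1])
print(f"infections peak on day {peak_day}")
```

In a model like the one Riou describes, curves of this kind would be tied to the observed cases and deaths through reporting rates and delays, and the parameters would be estimated rather than fixed by hand.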

P.P.S. Someone pointed me to another paper that uses a differential equation analysis, “Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus,” by Ruiyun Li, Sen Pei, Bin Chen, Yimeng Song, Tao Zhang, Wan Yang, and Jeffrey Shaman. I haven’t read this one carefully either. Data and code are here, and this is their summary:

Estimation of the prevalence and contagiousness of undocumented novel coronavirus (SARS-CoV2) infections is critical for understanding the overall prevalence and pandemic potential of this disease. Here we use observations of reported infection within China, in conjunction with mobility data, a networked dynamic metapopulation model and Bayesian inference, to infer critical epidemiological characteristics associated with SARS-CoV2, including the fraction of undocumented infections and their contagiousness. We estimate 86% of all infections were undocumented (95% CI: [82%-90%]) prior to January 23, 2020 travel restrictions. Per person, these undocumented infections were 55% as contagious as documented infections ([46%-62%]) and were the source of infection for two-thirds of documented cases. These findings explain the rapid geographic spread of SARS-CoV2 and indicate containment of this virus will be particularly challenging.

Jeff Shaman, one of the authors of that paper, writes:

I find it hard to believe that 1.6% of all infections died—the case fatality rate in China was 2.38% as of February 11th, the majority of that driven by activity in Hubei. The rapid spread of the virus geographically, serological evidence from evacuees, and our own work, indicate that only 10-20% of infections were confirmed cases. The CFR is deaths/confirmed cases. If one-tenth to one-fifth of infections are confirmed cases that implies 0.238% – 0.476% of all infections died.
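Shaman's arithmetic is easy to check. The short Python sketch below uses only the numbers in his message: deaths per infection equal deaths per confirmed case times the fraction of infections that are confirmed. The function and variable names are mine.

```python
def deaths_per_infection(cfr, fraction_confirmed):
    """Scale deaths per confirmed case by the share of infections that are confirmed."""
    return cfr * fraction_confirmed

cfr_china = 0.0238  # case fatality rate in China as of February 11th, per Shaman
low = deaths_per_infection(cfr_china, 0.10)   # 10% of infections confirmed
high = deaths_per_infection(cfr_china, 0.20)  # 20% of infections confirmed
print(f"{low:.3%} to {high:.3%}")  # prints "0.238% to 0.476%"
```

Note that this calculation takes the crude CFR at face value and so does not adjust for deaths that had not yet occurred by February 11th, which is the first of the two biases Riou describes.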

P.P.P.S. More here.