An unbiased estimator of the case fatality rate

Agustín Alvarez, Marina Fragalá, Marina Valdora

TL;DR

An unbiased estimator of the case fatality rate of a virus is presented, based on the distribution F of the time between confirmation and death of individuals who die because of the virus, and its asymptotic distribution is derived, enabling the corresponding confidence intervals to be established.

Abstract

During an outbreak of a new disease, the probability of dying once infected is an important quantity, but one that is difficult to compute. Since the true number of infected people is very hard to know, the focus is placed on estimating the case fatality rate, defined as the probability of dying once tested and confirmed as infected. Estimating this rate at the beginning of an epidemic remains challenging for several reasons, including the time gap between diagnosis and death and the rapid growth in the number of confirmed cases. In this work, an unbiased estimator of the case fatality rate of a virus is presented. The consistency of the estimator is demonstrated, and its asymptotic distribution is derived, enabling the corresponding confidence intervals (C.I.) to be established. The proposed method is based on the distribution F of the time between confirmation and death of individuals who die because of the virus. The estimator's performance is analyzed both in simulation scenarios and in the real-world context of Argentina in 2020 during the COVID-19 pandemic, consistently achieving excellent results when compared to an existing proposal as well as to the conventional "naive" estimator that was employed to report case fatality rates during the COVID-19 pandemic. In the simulated scenarios, the empirical coverage of our C.I. is studied, both using the F employed to generate the data and an estimated F; the desired confidence level is reached quickly when the true F is used and within a reasonable period of time when F is estimated.
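To make the two difficulties named in the abstract concrete (the delay between confirmation and death, and the rapid growth in confirmed cases), the following minimal sketch computes the expected value of the naive ratio, deaths so far over confirmed cases so far. Everything specific here is an assumption for illustration and is not taken from the paper: the geometric delay distribution, the 10% daily growth, the fatality probability 0.03, and all function names.

```python
def geometric_delay_cdf(k, q=0.1):
    """P(delay <= k days) for a hypothetical geometric delay on {0, 1, 2, ...}."""
    return 1.0 - (1.0 - q) ** (k + 1)


def expected_naive_cfr(cases, p, cdf):
    """Expected value of (deaths so far) / (confirmed so far) on the last day.

    cases[d] is the number of confirmations on day d. Every confirmed case
    dies with probability p, after a delay with distribution function cdf,
    so a case confirmed on day d has died by day t with probability
    p * cdf(t - d).
    """
    t = len(cases) - 1
    expected_deaths = sum(c * p * cdf(t - d) for d, c in enumerate(cases))
    return expected_deaths / sum(cases)


p = 0.03  # assumed fatality probability, for illustration only
growing = [round(10 * 1.1 ** d) for d in range(60)]  # assumed 10% daily growth

# Most confirmed cases are recent, so many eventual deaths are not yet observed.
print("true p:", p)
print("expected naive ratio:", expected_naive_cfr(growing, p, geometric_delay_cdf))
```

Under these assumptions the expected naive ratio falls below p, because recent cohorts dominate the denominator while contributing few observed deaths.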

Paper Structure

This paper contains 11 sections, 2 theorems, 32 equations, 9 figures, 2 tables.

Key Result

Theorem 1

For each $t\in\mathbb{N}$, let $\{D_{d,i}(t)\}_{d,i}$, for $0\le d\le t$ and $1\le i\le c_d$, be independent random variables with distribution $\mathrm{Be}(p_d F_d(t-d))$. Assume that the total number of confirmed cases until day $t$, ... Then:

$(i)$ If A1 holds, then $CFR(t)-cfr(t) \stackrel{p}{\longrightarrow} 0$.

$(ii)$ If A1 to A3 hold, then $\frac{CFR(t)-cfr(t)}{\sqrt{\mathbb{V}(CFR(t))}}\stackrel{D}{\longrightarrow}\dots$
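The Bernoulli model in Theorem 1 can be simulated directly. The sketch below is only an illustration under stated assumptions. It uses a single hypothetical geometric delay distribution $F$ for every cohort, where the theorem writes $F_d$. It also uses an inverse-weighting estimator, $\sum_d D_d(t)/F(t-d)$ divided by $\sum_d c_d$, chosen because it is unbiased under this model. The source excerpt does not give the formula for $CFR(t)$, so this should not be read as the authors' estimator. All names and parameter values are hypothetical.

```python
import random


def delay_cdf(k, q=0.1):
    """Assumed F(k): geometric delay on {0, 1, 2, ...}, so F(0) = q > 0."""
    return 1.0 - (1.0 - q) ** (k + 1)


def simulate_deaths(cases, probs, t, rng):
    """Draw D_d(t) = sum_i D_{d,i}(t), with D_{d,i}(t) ~ Be(p_d * F(t - d))."""
    deaths = []
    for d, (c, p) in enumerate(zip(cases, probs)):
        success = p * delay_cdf(t - d)
        deaths.append(sum(rng.random() < success for _ in range(c)))
    return deaths


def naive_cfr(deaths, cases):
    """Observed deaths so far divided by confirmed cases so far."""
    return sum(deaths) / sum(cases)


def weighted_cfr(deaths, cases, t):
    """Illustrative estimator: weight each cohort's deaths by 1 / F(t - d)."""
    total = sum(D / delay_cdf(t - d) for d, D in enumerate(deaths))
    return total / sum(cases)


def run_experiment(reps=200, seed=0):
    """Return (target, mean naive estimate, mean weighted estimate)."""
    rng = random.Random(seed)
    t = 30
    cases = [round(20 * 1.1 ** d) for d in range(t + 1)]  # assumed growth
    probs = [0.05] * (t + 1)                              # assumed p_d
    target = sum(c * p for c, p in zip(cases, probs)) / sum(cases)
    naive_sum = weighted_sum = 0.0
    for _ in range(reps):
        deaths = simulate_deaths(cases, probs, t, rng)
        naive_sum += naive_cfr(deaths, cases)
        weighted_sum += weighted_cfr(deaths, cases, t)
    return target, naive_sum / reps, weighted_sum / reps


if __name__ == "__main__":
    target, naive_mean, weighted_mean = run_experiment()
    print("target:", target)
    print("mean naive estimate:", naive_mean)
    print("mean weighted estimate:", weighted_mean)
```

Averaged over replications, the weighted estimate should sit close to the target while the naive ratio sits below it. The weighting only works when $F(t-d) > 0$ for every cohort, which is why the assumed delay distribution puts positive mass on day 0.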

Figures (9)

  • Figure 1: Daily cases in thousands in India (left) and Argentina (right).
  • Figure 2: Proportion of confirmed cases which finally died among those cases confirmed between days $d-3$ and $d+3$ in the first $400$ days of the epidemic in Argentina.
  • Figure 3: Functional boxplots of $CFR(t)$ (top left), $CFR_G(t)$ (top right), $CFR_N(t)$ (bottom left) and $CFR_F(t)$ (bottom right). The parameters used are: Argentine $c_d$, $\mu=12.6$, abrupt $p_d$. The estimation is made using $\hat{F}_{\mathrm{emp}}$ with $t_1=45$ and $t_{back}=45$.
  • Figure 4: Functional boxplots of $CFR(t)$ (top left), $CFR_G(t)$ (top right), $CFR_N(t)$ (bottom left) and $CFR_F(t)$ (bottom right). The parameters used are: Indian $c_d$, $\mu=6$ and Argentine $p_d$. The estimation is made using $\hat{F}_{\mathrm{emp}}$ with $t_1=30$ and $t_{back}=23$.
  • Figure 5: Finite sample bias (left) and mean squared error multiplied by $10^4$ (right). The black curve corresponds to $CFR_N(t)$, the red curve to $CFR(t)$ and the green curve to $CFR_G(t)$. The parameters used are: Argentine $c_d$, $\mu=12.6$, abrupt $p_d$. The estimation is made using $\hat{F}_{\mathrm{emp}}$ with $t_1=45$ and $t_{back}=45$.
  • ...and 4 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Theorem