Household size can explain 40% of the variance in cumulative COVID-19 incidence across Europe

Seba Contreras; Philipp Dönges; Maciej Filinski; Joel Wagner; Viktor Bezborodov; Marcin Bodych; Barbara Pabjan; Franciszek Rakowski; Jan Pablo Burgard; Tyll Krueger; Viola Priesemann

TL;DR

The study asks how household size modulates COVID-19 spread across Europe. It develops a framework that separates out-household from within-household transmission, introducing the boost factor $\mathcal{B}$ and the household reproduction number $\mathcal{R}_H$, and linking prevalence to the out-household reproduction number $\bar{R}_{\rm out}$. Using country-specific infection fatality rates (CIFRs), $\gamma$-extended COVID-19 deaths, and Eurostat household size distributions, it finds that the effective household size $\eta^{*}$ explains about 40.6% of the variance in cumulative incidence across 34 European countries (95% CI: [14.7%, 46.1%]); observed prevalence correlates strongly with $\eta^{*}$ (Pearson $r = 0.69$, $p = 4.2\times 10^{-6}$). The findings imply that larger households act as a structural disadvantage, requiring stronger non-pharmaceutical interventions (NPIs) on between-household contacts to achieve comparable containment. They also show that household structure can confound associations with development indices such as the Human Development Index (HDI), underscoring the need to account for structural demographics in pandemic policy.
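The confounding claim can be probed with a semi-partial correlation, which the Figure 4 caption names as the measure used. The following is a minimal sketch, assuming an ordinary least-squares fit is used to remove the linear influence of $\eta^{*}$ from the prevalence $\alpha$ before correlating the remainder with the HDI. The function names and all numbers are hypothetical, synthetic values for illustration, not data from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def residualize(y, x):
    """Residuals of the least-squares line y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def semi_partial(alpha, hdi, control):
    """Correlation of HDI with the part of alpha not linearly explained by control."""
    return pearson(residualize(alpha, control), hdi)

# Synthetic, made-up values (NOT the paper's data).
eta = [2.1, 2.4, 2.8, 3.0, 3.3, 3.9]          # effective household size
hdi = [0.95, 0.93, 0.90, 0.88, 0.86, 0.80]    # Human Development Index
alpha = [0.05, 0.09, 0.08, 0.14, 0.12, 0.20]  # prevalence

raw = pearson(alpha, hdi)
adjusted = semi_partial(alpha, hdi, eta)
print(f"Pearson r(alpha, HDI)      = {raw:.3f}")
print(f"semi-partial r, eta removed = {adjusted:.3f}")
```

Comparing `raw` with `adjusted` shows how much of the apparent association between prevalence and HDI disappears once the household-size component of prevalence is taken out.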

Abstract

Household size impacts the spread of respiratory infectious diseases: Larger households tend to boost transmission by acquiring external infections more frequently and subsequently transmitting them back into the community. Furthermore, mandatory interventions primarily modulate contagion between households rather than within them. We developed an approach to quantify the role of household size in epidemics by separating within-household from out-household transmission, and found that household size explains 41% of the variability in cumulative COVID-19 incidence across 34 European countries (95% confidence interval: [15%, 46%]). The contribution of households to the overall dynamics can be quantified by a boost factor that increases with the effective household size, implying that countries with larger households require more stringent interventions to achieve the same levels of containment. This suggests that households constitute a structural (dis-)advantage that must be considered when designing and evaluating mitigation strategies.
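To make "explains 41% of the variability" concrete, the sketch below computes a coefficient of determination between observed prevalences and the prevalences predicted by some model. It assumes the standard definition $R^2 = 1 - SS_{\rm res}/SS_{\rm tot}$; whether the authors use exactly this estimator is an assumption, and the function name and numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` captured by `predicted`.

    Uses R^2 = 1 - SS_res / SS_tot. The value can be negative when the
    predictions fit worse than simply predicting the mean of `observed`.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic, made-up prevalences (NOT the paper's data).
observed = [0.05, 0.09, 0.08, 0.14, 0.12, 0.20]
theoretical = [0.06, 0.08, 0.10, 0.11, 0.13, 0.17]
print(f"R^2 = {r_squared(observed, theoretical):.3f}")
```

Under this definition, a model that reproduces every observation gives 1, and a model that always predicts the observed mean gives 0.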

Paper Structure (13 sections, 16 equations, 12 figures, 5 tables)

Extended methods
Prevalence estimation
Estimation of country-specific infection fatality rates (CIFRs)
Estimation of $\gamma$-extended COVID-19 deaths
Estimation of $\gamma$
Comparison of excess mortality estimates and $\gamma$-extended COVID-19 deaths
Social determinants of household size
Economic factors
Cultural factors
Institutional factors
Robustness check
Supplementary Tables
Supplementary Figures

Figures (12)

Figure 1: Disease spread from the perspective of households. A. Disease spread can be categorized into the contagion occurring outside and inside of households. $\rho_i$ quantifies the number of potentially contagious contacts of an individual $i$. The out-household reproduction number $\bar{R}_{\rm out}$ is defined as the mean $\rho_i$, averaged over all infectious individuals in an outbreak $I$. Once an infectious individual enters a household of size $k$, the in-household secondary attack rate $a_H$ quantifies the in-household outbreak as the fraction of the $k-1$ members who are infected. The household reproduction number $\mathcal{R}_H$ is defined as the expected number of households a typical infectious household infects. B. Variation of the effective household size $\eta^{*}$ across European countries. C. Methods overview: Using age-dependent infection fatality rates (IFRs) and country demographics, we obtained country-specific IFR samples (CIFRs). Combined with an estimate for the number of COVID-19 deaths, we estimated the COVID-19 prevalence (cumulative incidence) for each country. Finally, combining country-specific prevalences, household size distributions, and in-household secondary attack rates, we computed the out-household reproduction number $\bar{R}_{\rm out}$.
Figure 2: Differences in the effective household size can explain about 40% of the variation in COVID-19 prevalence across European countries in the first pandemic year. A. We analyzed a hypothetical scenario where all European countries had the same out-household spread (computed by treating Europe as a single country), namely, the same conditions (NPIs, climate, social culture) except for their household size distribution. By doing so, we calculated theoretical prevalences (gray crosses and black dashed line) and interpreted deviations from them as differences in the effective reduction of out-household contagion across countries (B). Besides the strong correlation between observed prevalence and effective household size (C), the latter explains 40.6% (95% CI: [14.7%, 46.1%]) of the variance observed in the former (D). Countries are color-coded as follows: Nordic countries (blue), Western European countries (green), post-communist countries of Central and Eastern Europe (red), and island countries (yellow). Note that, following the method in Fig. 1C, the result is a distribution of theoretical prevalence, which is not shown in this figure for clarity. A larger version with country labels is available in the Supplementary Figures.
Figure 3: Out-household COVID-19 spread across European countries. A, B. Using our prevalence estimations and country-specific infection fatality rates (CIFR), we computed the $\bar{R}_{\rm out}$ distributions (A) and their deviations from the European mean (B) (countries ranked in order of increasing median) for the timeframe between January 1st, 2020, and June 13th, 2021. C. As positions in the ranking for different CIFR vectors are correlated, we obtained a distribution over the possible ranking spots for each country. Vertical lines represent the median and 95% CIs, and color bars represent the probability density or mass.
Figure 4: Effective household size $\eta^{*}$ and the out-household reproduction number $\bar{R}_{\rm out}$ influence the correlation between COVID-19 prevalence $\alpha$ and the Human Development Index (HDI). We used the semi-partial correlation to quantify the relative change in the correlation (Pearson vs. semi-partial) between $\alpha$ and HDI when removing the influence of $\eta^{*}$ (A) and $\bar{R}_{\rm out}$ (B) from $\alpha$.
Figure S1: Country infection fatality rate (CIFR) estimation for different countries. We compute country-specific IFRs using age-resolved estimates of the IFR (Brazeau et al., 2020, 2022) and country demographics. Given that age-resolved IFRs are reported as a distribution, we draw 50,000 samples and fit an exponential function to capture the correlation between measurements. We use the fitted profile as an estimated CIFR and weight it by country demographics to obtain a country-specific IFR (color scheme). The vertical bars extend until the 10th and 90th percentiles, and the black bar indicates the median. Europe marked with a double asterisk (**) includes only countries for which we have full Eurostat data.
...and 7 more figures
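
The first two steps of the methods overview in Figure 1C, together with the demographic weighting described in Figure S1, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a single point value per age group instead of the 50,000 sampled IFR profiles, and it assumes prevalence is obtained as deaths divided by the expected deaths per infection times population. Function names, age groups, and all numbers are hypothetical.

```python
def country_ifr(age_shares, age_ifr):
    """Demographics-weighted infection fatality rate for one country.

    age_shares: fraction of the population in each age group (sums to 1).
    age_ifr:    infection fatality rate of each age group, same order.
    """
    if len(age_shares) != len(age_ifr):
        raise ValueError("age_shares and age_ifr must have the same length")
    if abs(sum(age_shares) - 1.0) > 1e-9:
        raise ValueError("age_shares must sum to 1")
    return sum(share * ifr for share, ifr in zip(age_shares, age_ifr))

def prevalence(deaths, population, ifr):
    """Cumulative incidence implied by a death count and an IFR.

    Assumes deaths = ifr * infections, so infections = deaths / ifr.
    """
    return deaths / (ifr * population)

# Hypothetical three-group example (NOT values from the paper).
shares = [0.6, 0.3, 0.1]       # young, middle-aged, old
ifrs = [0.0002, 0.004, 0.06]   # assumed age-specific IFRs
cifr = country_ifr(shares, ifrs)
alpha = prevalence(deaths=15_000, population=10_000_000, ifr=cifr)
print(f"CIFR = {cifr:.5f}, prevalence = {alpha:.3f}")
```

Repeating the calculation for each sampled IFR profile would yield a distribution of prevalences per country rather than a single number, which is what the captions describe.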
