Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration
Alexandra Cimpean, Timothy Verstraeten, Lander Willem, Niel Hens, Ann Nowé, Pieter Libin
TL;DR
The paper tackles the problem of efficiently evaluating fine-grained COVID-19 vaccine allocation policies under substantial computational cost and decision uncertainty. It introduces a Bayesian anytime $m$-top exploration framework, implemented as epidemic bandits, to identify the top $m$ vaccination strategies with quantified uncertainty. Through ground-truth validation in a Belgian STRIDE-based setting, the authors show how different contact-reduction regimes and vaccine-uptake levels shape the prioritization of age groups and vaccine types, and they provide a GPL-licensed software framework for replication and future use. The work offers policymakers a compact set of high-utility strategies with uncertainty bounds, enabling robust, data-informed decisions during evolving epidemics.
Abstract
Individual-based epidemiological models support the study of fine-grained preventive measures, such as tailored vaccine allocation policies, in silico. As individual-based models are computationally intensive, it is pivotal to identify optimal strategies within a reasonable computational budget. Moreover, because implementing preventive strategies has a high societal impact, the uncertainty regarding decisions should be communicated to policymakers; a Bayesian approach quantifies this uncertainty naturally. We present a novel technique for evaluating vaccine allocation strategies using a multi-armed bandit framework in combination with a Bayesian anytime $m$-top exploration algorithm. $m$-top exploration allows the algorithm to learn the $m$ policies for which it expects the highest utility, enabling experts to inspect this small set of alternative strategies, along with their quantified uncertainty. The anytime component gives policy advisors flexibility regarding the computation time and the desired confidence, which is important because this trade-off is difficult to make beforehand. We consider the Belgian COVID-19 epidemic using the individual-based model STRIDE, where we learn a set of vaccination policies that minimize the number of infections and hospitalisations. Through experiments, we show that our method can efficiently identify the $m$-top policies, which we validate in a scenario where the ground truth is available. Finally, we explore how vaccination policies can best be organised under different contact reduction schemes, and we investigate the impact of vaccine uptake proportions (i.e., the proportion of individuals who will comply with the strategy and take the vaccine).
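To make the idea of Bayesian anytime $m$-top exploration concrete, the following is a minimal sketch and not the authors' implementation. It assumes Gaussian rewards with known noise variance, a zero-mean Gaussian prior per arm, and a Thompson-sampling rule that pulls one of the two arms on either side of the $m$-top boundary; these modelling choices and all names (`m_top_thompson`, `pull`, `true_means`) are hypothetical. Each arm stands in for one vaccine allocation policy, and a pull stands in for one simulator run whose reward could be, for example, the negative number of infections.

```python
import numpy as np


def gaussian_posterior(counts, sums, prior_var, noise_var):
    """Conjugate posterior over each arm's mean (zero prior mean, known noise)."""
    post_var = 1.0 / (1.0 / prior_var + counts / noise_var)
    post_mean = post_var * (sums / noise_var)
    return post_mean, post_var


def m_top_thompson(pull, n_arms, m, budget, rng, prior_var=1.0, noise_var=1.0):
    """Illustrative m-top exploration; requires 0 < m < n_arms."""
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for _ in range(budget):
        post_mean, post_var = gaussian_posterior(counts, sums, prior_var, noise_var)
        # Draw one plausible mean per arm and rank arms by the sampled values.
        sample = rng.normal(post_mean, np.sqrt(post_var))
        order = np.argsort(-sample)
        # Pull one of the two arms straddling the m-top boundary (ranks m and m+1).
        arm = order[m - 1] if rng.random() < 0.5 else order[m]
        sums[arm] += pull(arm)
        counts[arm] += 1
    # Anytime property: the loop can stop at any budget and still report
    # a recommendation together with posterior uncertainty.
    post_mean, post_var = gaussian_posterior(counts, sums, prior_var, noise_var)
    top = np.argsort(-post_mean)[:m]
    return top, post_mean, np.sqrt(post_var)


# Toy stand-in for the simulator: five hypothetical policies with fixed utilities.
rng = np.random.default_rng(0)
true_means = np.array([0.0, 0.2, 0.4, 2.0, 2.5])


def pull(arm):
    # One noisy "simulation run" for the chosen policy (assumed unit variance).
    return true_means[arm] + rng.normal()


top, post_mean, post_std = m_top_thompson(pull, n_arms=5, m=2, budget=2000, rng=rng)
print("recommended policies:", sorted(int(a) for a in top))
print("posterior std per policy:", np.round(post_std, 3))
```

Because the posterior is updated after every pull, the recommended set and its posterior standard deviations are available at any point, which is what lets a policy advisor trade computation time against confidence after the run has started rather than before.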
