SIR-RL: Reinforcement Learning for Optimized Policy Control during Epidemiological Outbreaks in Emerging Market and Developing Economies

Maeghal Jain; Ziya Uddin; Wubshet Ibrahim

TL;DR

This work addresses balancing public health and economic stability during pandemics in emerging markets and developing economies by formulating an SIR-based environment that incorporates lockdown via a stringency index and vaccination. It combines time-varying epidemic dynamics with a cubic GDP relationship and trains a reinforcement learning agent (using LSTM-enhanced networks) to optimize policy actions that modulate stringency while considering health and economic rewards. Key contributions include (i) a sequence of SIR extensions with lockdown and time-varying vaccination, (ii) a data-driven GDP–stringency link, and (iii) a deep RL framework with a tunable reward that yields policy trajectories capable of keeping the effective reproduction number $R_e$ under target thresholds while mitigating GDP loss. The findings suggest that time-varying vaccination and RL-guided stringency strategies can improve outcomes over static policies, offering a transparent and adaptable tool for policymakers in EMDE contexts, though the approach currently relies on a deterministic model and would benefit from stochastic extensions and broader decision factors in future work.
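As a rough illustration of the kind of environment described above, the following Python sketch steps an SIR model in which lockdown and vaccination both act on the dynamics. The paper's exact equations are not reproduced here, so the functional form is an assumption: stringency is taken to scale the transmission rate by (1 - stringency), and vaccination is taken to move susceptibles directly to the recovered compartment at rate nu. The function names and all parameter values are hypothetical.

```python
def sir_step(s, i, r, beta, gamma, stringency, nu, dt=1.0):
    """One forward-Euler step of an SIR model on population fractions.

    Assumptions (not taken from the paper): lockdown scales transmission
    by (1 - stringency); vaccination moves susceptibles to recovered at rate nu.
    """
    beta_eff = beta * (1.0 - stringency)
    new_infections = beta_eff * s * i
    recoveries = gamma * i
    vaccinations = nu * s
    s_next = s + dt * (-new_infections - vaccinations)
    i_next = i + dt * (new_infections - recoveries)
    r_next = r + dt * (recoveries + vaccinations)
    return s_next, i_next, r_next


def effective_reproduction_number(s, beta, gamma, stringency):
    """R_e under the same assumed lockdown scaling: beta_eff * s / gamma."""
    return beta * (1.0 - stringency) * s / gamma


def simulate(days, stringency, beta=0.3, gamma=0.1, nu=0.0, i0=0.001):
    """Run the model with a constant stringency; return final state and peak infected."""
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(days):
        s, i, r = sir_step(s, i, r, beta, gamma, stringency, nu)
        peak = max(peak, i)
    return s, i, r, peak
```

With these illustrative parameters, a fully susceptible population under stringency 0.5 has $R_e = 0.3 \times 0.5 / 0.1 = 1.5$, and raising stringency to 0.7 brings it below 1. A time-varying vaccination rate could be modelled by passing a different nu at each step.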

Abstract

The outbreak of COVID-19 has highlighted the intricate interplay between public health and economic stability on a global scale. This study proposes a novel reinforcement learning framework designed to optimize health and economic outcomes during pandemics. The framework leverages the SIR model, integrating both lockdown measures (via a stringency index) and vaccination strategies to simulate disease dynamics. The stringency index, indicative of the severity of lockdown measures, influences both the spread of the disease and the economic health of a country. Developing nations, which bear a disproportionate economic burden under stringent lockdowns, are the primary focus of our study. By implementing reinforcement learning, we aim to optimize governmental responses and strike a balance between the competing costs associated with public health and economic stability. This approach also enhances transparency in governmental decision-making by establishing a well-defined reward function for the reinforcement learning agent. In essence, this study introduces an innovative and ethical strategy to navigate the challenge of balancing public health and economic stability amidst infectious disease outbreaks.
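To make the idea of a well-defined, tunable reward concrete, the sketch below combines an economic term with a health term. It is only an assumption about how such a reward might look, not the paper's definition: the cubic coefficients are placeholders rather than fitted values, the health term is a simple penalty on $R_e$ exceeding a target, and the names `normalized_gdp`, `reward`, `weight`, and `r_target` are hypothetical.

```python
def normalized_gdp(stringency, coeffs=(1.0, -0.1, -0.2, -0.1)):
    """Cubic polynomial a0 + a1*x + a2*x^2 + a3*x^3 in stringency x in [0, 1].

    The coefficients are illustrative placeholders, not values fitted to data.
    """
    a0, a1, a2, a3 = coeffs
    x = stringency
    return a0 + a1 * x + a2 * x**2 + a3 * x**3


def reward(stringency, r_e, weight=0.5, r_target=1.0):
    """Weighted trade-off between economic output and epidemic control.

    weight in [0, 1] tunes the balance: 1.0 considers only GDP,
    0.0 considers only the penalty for R_e above r_target.
    """
    health_penalty = max(0.0, r_e - r_target)
    return weight * normalized_gdp(stringency) - (1.0 - weight) * health_penalty
```

Exposing the weight as an explicit parameter is what makes the trade-off inspectable: anyone reviewing the policy can see how much economic loss the agent is allowed to accept per unit of excess $R_e$.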

Paper Structure (17 sections, 45 equations, 14 figures)

This paper contains 17 sections, 45 equations, 14 figures.

Introduction
Mathematical Formulation and Numerical Computation
Simple SIR Model
SIR Model with Lockdown
SIR Model with Lockdown and Vaccination
Optimizing Window Length for Time-varying Vaccination Rate
SIR Model with Lockdown and Time-varying Vaccination Rate
Modelling Normalized GDP with Stringency
Reinforcement Learning
Defining the Reward Function
Deep Reinforcement Learning and Training
Results
Discussion
Experiment Settings
Dataset
...and 2 more sections

Figures (14)

Figure 1: Loss for Different Window Lengths. Different window lengths are compared to find the one that minimizes the loss in two cases: predicting all three populations (susceptible, infected, recovered) or only the infected population.
Figure 2: $\nu$ Varying with Time. This depicts how the vaccination rate ($\nu$) changes over time and highlights the introduction of the vaccination campaign in India.
Figure 3: Deep Reinforcement Learning. Deep learning algorithms used in reinforcement learning enable more complex decision-making.
Figure 4: SIR Model Comparison for India. The figure compares the fitted simple SIR model (eq:S_without_lockdown through eq:cost_I_without_lockdown) with real data. The model clearly overestimates the infected population.
Figure 5: SIR Model with Lockdown Analysis for India. This figure compares the fitted SIR model with lockdown (eq:S_with_lockdown through eq:cost_I_with_lockdown) with real data. Introducing lockdown measures has a discernible effect on the dynamics of disease progression. Although the overestimation persists, the model's peaks now align closely with the observed data, and the model captures key trends.
...and 9 more figures
