Parameter estimation of epidemic spread in two-layer random graphs by classical and machine learning methods
Ágnes Backhausz, Edit Bognár, Villő Csiszár, Damján Tárkányi, András Zempléni
TL;DR
This study addresses the problem of estimating the infection rate parameter $\tau$ in epidemic spread on two-layer random graphs by comparing classical maximum-likelihood estimation with XGBoost and CNN approaches. It uses SIR dynamics simulated via the Gillespie algorithm on graphs comprising a households layer and a second layer that is either scale-free or clique-based, with edge weight $w$ and recovery rate $\gamma=1$. The work contributes a detailed comparison of estimation accuracy across epidemic phases, analyzes the impact of training/test graph structure and additional features, and provides practical guidance on when to favor ML methods over classical estimators. Overall, XGBoost offers the strongest performance, CNN provides robustness at a higher computational cost, and ML methods particularly excel when structural information is incomplete.
Abstract
Our main goal in this paper is to quantitatively compare the performance of classical methods with that of XGBoost and convolutional neural networks in a parameter estimation problem for epidemic spread. As we use flexible two-layer random graphs as the underlying network, we can also study how much the structure of the graphs in the training set and the test set can differ while still yielding a reasonably good estimate. In addition, we examine whether additional information (such as the average degree of infected vertices) can help improve the results, compared to the case when we only know the time series consisting of the number of susceptible and infected individuals. Our simulation results also show which methods are most accurate in the different phases of the epidemic.
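
To make the setting concrete, the following is a minimal Python sketch of a Gillespie-type SIR simulation on a two-layer graph, together with the textbook maximum-likelihood estimate of $\tau$ for a fully observed epidemic (number of infections divided by the time-integrated total weight of susceptible-infected edges). It is an illustration under stated assumptions, not necessarily the implementation or estimator used in the paper: the function names, the graph sizes, the convention that household edges have weight 1 while second-layer edges have weight $w$, and the simple random-clique second layer are all hypothetical choices made for this example.

```python
import math
import random


def two_layer_graph(n, household_size, clique_size, w, rng):
    """Weighted adjacency dict {u: {v: weight}} (illustrative construction).

    Assumption: layer 1 = consecutive households (edge weight 1.0),
    layer 2 = cliques over a random shuffle of the vertices (edge weight w).
    """
    adj = {u: {} for u in range(n)}

    def add_cliques(order, size, weight):
        for start in range(0, n, size):
            group = order[start:start + size]
            for i, u in enumerate(group):
                for v in group[i + 1:]:
                    adj[u][v] = adj[u].get(v, 0.0) + weight
                    adj[v][u] = adj[v].get(u, 0.0) + weight

    add_cliques(list(range(n)), household_size, 1.0)
    shuffled = list(range(n))
    rng.shuffle(shuffled)
    add_cliques(shuffled, clique_size, w)
    return adj


def gillespie_sir(adj, tau, gamma, initial_infected, rng):
    """Simulate SIR until no infected vertex remains.

    Returns (history, n_infections, exposure), where history is a list of
    (time, #susceptible, #infected) and exposure is the time integral of
    the total weight of susceptible-infected edges.
    """
    state = {u: "S" for u in adj}
    for u in initial_infected:
        state[u] = "I"
    infected = set(initial_infected)
    s_count = len(adj) - len(infected)
    t, n_infections, exposure = 0.0, 0, 0.0
    history = [(t, s_count, len(infected))]
    while infected:
        # Infection pressure on each susceptible vertex (recomputed for clarity).
        pressure = {u: sum(wt for v, wt in adj[u].items() if state[v] == "I")
                    for u in adj if state[u] == "S"}
        total_si = sum(pressure.values())
        infection_rate = tau * total_si
        total_rate = infection_rate + gamma * len(infected)
        dt = rng.expovariate(total_rate)  # waiting time to the next event
        t += dt
        exposure += total_si * dt
        if rng.random() * total_rate < infection_rate:
            # Infection: pick a susceptible vertex proportionally to its pressure.
            target = rng.choices(list(pressure), weights=list(pressure.values()))[0]
            state[target] = "I"
            infected.add(target)
            s_count -= 1
            n_infections += 1
        else:
            # Recovery: pick an infected vertex uniformly at random.
            u = rng.choice(sorted(infected))
            state[u] = "R"
            infected.remove(u)
        history.append((t, s_count, len(infected)))
    return history, n_infections, exposure


def mle_tau(n_infections, exposure):
    """Full-observation MLE of tau: infections per unit of S-I exposure."""
    return n_infections / exposure if exposure > 0 else math.nan


if __name__ == "__main__":
    rng = random.Random(1)
    adj = two_layer_graph(n=400, household_size=4, clique_size=6, w=0.3, rng=rng)
    history, n_inf, exposure = gillespie_sir(adj, tau=0.5, gamma=1.0,
                                             initial_infected=[0, 1, 2], rng=rng)
    print("events:", len(history) - 1, "tau_hat:", mle_tau(n_inf, exposure))
```

Note that this estimator needs the full graph and the infection state of every vertex at all times. The machine learning approaches compared in the paper are instead trained on aggregated information, such as the `(time, S, I)` series recorded in `history`, which is why they remain applicable when structural information is incomplete.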
