Hybrid Predictive Modeling of Malaria Incidence in the Amhara Region, Ethiopia: Integrating Multi-Output Regression and Time-Series Forecasting
Kassahun Azezew, Amsalu Tesema, Bitew Mekuria, Ayenew Kassie, Animut Embiale, Ayodeji Olalekan Salau, Tsega Asresa
TL;DR
The paper addresses the challenge of forecasting malaria incidence in the Amhara Region by proposing a hybrid framework that combines time-series forecasting with ensemble multi-output regression to jointly predict Plasmodium falciparum ($Pf$), Plasmodium vivax ($Pv$), and total malaria cases. It draws on environmental, demographic, and historical incidence data, and applies feature engineering and validation (chronological splits, cross-validation, grid search) to Random Forest, Gradient Boosting, and AdaBoost multi-output models. The results indicate that the hybrid approach is more accurate than single-output baselines while delivering simultaneous species-specific and spatio-temporal predictions. The framework has practical implications for targeted interventions and resource planning in endemic settings, and it contributes to epidemiological forecasting methodology by integrating multi-output and time-series components.
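As an illustration of the chronological validation mentioned above, the following minimal sketch builds a time-ordered train/test split and expanding-window validation folds, so that no model is ever evaluated on data earlier than its training data. The function names, the 36-month series length, and the fold sizes are assumptions made for this example; the summary does not state the split sizes the authors actually used.

```python
from typing import List, Tuple


def chronological_split(n_samples: int, test_fraction: float = 0.2) -> Tuple[List[int], List[int]]:
    """Split indices 0..n_samples-1 so every test index is later than every train index."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    n_test = max(1, int(round(n_samples * test_fraction)))
    cut = n_samples - n_test
    if cut < 1:
        raise ValueError("not enough samples to leave a training set")
    return list(range(cut)), list(range(cut, n_samples))


def expanding_window_folds(n_samples: int, n_folds: int, val_size: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, validation) index pairs; training data always precedes validation."""
    first_val_start = n_samples - n_folds * val_size
    if first_val_start < 1:
        raise ValueError("series too short for the requested folds")
    folds = []
    for k in range(n_folds):
        start = first_val_start + k * val_size
        # The training window grows with each fold; the validation window slides forward.
        folds.append((list(range(start)), list(range(start, start + val_size))))
    return folds


# Hypothetical example: 36 monthly observations, the last 25% held out for testing.
train_idx, test_idx = chronological_split(36, test_fraction=0.25)
folds = expanding_window_folds(len(train_idx), n_folds=3, val_size=3)
```

In a grid search, each candidate hyperparameter setting would be scored on these folds and the held-out test indices would be touched only once, at the end.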
Abstract
Malaria remains a major public health concern in Ethiopia, particularly in the Amhara Region, where seasonal and unpredictable transmission patterns make prevention and control challenging. Accurately forecasting malaria outbreaks is essential for effective resource allocation and timely interventions. This study proposes a hybrid predictive modeling framework that combines time-series forecasting, multi-output regression, and conventional regression-based prediction to forecast the incidence of malaria. Environmental variables, past malaria case data, and demographic information from Amhara Region health centers were used to train and validate the models. The multi-output regression approach enables the simultaneous prediction of multiple outcomes, including Plasmodium species-specific cases, temporal trends, and spatial variations, whereas the hybrid framework captures both seasonal patterns and correlations among predictors. The proposed model exhibits higher prediction accuracy than single-method approaches, exposing hidden patterns and providing valuable information to public health authorities. This study provides a valid and repeatable malaria incidence prediction framework that can support evidence-based decision-making, targeted interventions, and resource optimization in endemic areas.
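To make the multi-output idea concrete, the sketch below shows one way past case counts could be arranged so that a single model predicts several outcomes at once: each row holds lagged counts as features and a vector of same-month targets. The series values, the two-month lag, and the names `TARGETS` and `build_rows` are hypothetical and chosen only for illustration; the abstract does not specify the authors' feature set, and environmental or demographic covariates would be appended to each feature row in the same way.

```python
from typing import Dict, List, Sequence, Tuple

# Fixed target order so every output vector has the same layout.
TARGETS = ("pf", "pv", "total")

# Hypothetical monthly case counts for one health center (illustrative numbers only).
series: Dict[str, List[int]] = {
    "pf": [12, 15, 30, 42, 25, 10],
    "pv": [8, 9, 14, 20, 13, 7],
    "total": [20, 24, 44, 62, 38, 17],
}


def build_rows(data: Dict[str, Sequence[int]], n_lags: int = 2) -> Tuple[List[List[int]], List[List[int]]]:
    """Build lagged feature rows X and multi-output target rows Y from aligned series."""
    lengths = {len(data[name]) for name in TARGETS}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length")
    n = lengths.pop()
    X: List[List[int]] = []
    Y: List[List[int]] = []
    for t in range(n_lags, n):
        features: List[int] = []
        for name in TARGETS:
            # Lags are ordered oldest first: t - n_lags, ..., t - 1.
            features.extend(data[name][t - n_lags:t])
        X.append(features)
        # One target vector per month: [pf, pv, total] at time t.
        Y.append([data[name][t] for name in TARGETS])
    return X, Y


X, Y = build_rows(series, n_lags=2)
```

Rows in `X` and `Y` remain in time order, so they can be passed to any regressor that accepts a two-dimensional target and split chronologically before fitting.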
