Forests of Uncertaint(r)ees: Using tree-based ensembles to estimate probability distributions of future conflict
Daniel Mittermaier, Tobias Bohne, Martin Hofer, Daniel Racek
TL;DR
This work forecasts fatalities from armed conflict at fine spatio-temporal resolution, shifting from point predictions to full predictive distributions. It implements a modular AutoML pipeline that combines tree-based classifiers and distributional regressors in a hurdle framework, and adds regional (local) ensembles to handle data heterogeneity. Across six test windows, the approach generally outperforms history-based benchmarks on distributional metrics. The evaluation also shows how zero inflation shapes metric behavior and examines the value of integrating regional models. The study points to practical ways of providing actionable uncertainty estimates for early-warning systems and outlines directions for incorporating broader data sources and selective prediction strategies.
Abstract
Predictions of fatalities from violent conflict at the PRIO-GRID-month (pgm) level are characterized by high levels of uncertainty, which limits their usefulness in practical applications. We discuss the two main sources of uncertainty for this prediction task, the nature of violent conflict and data limitations, and embed this discussion in the wider literature on uncertainty quantification in machine learning. We develop a strategy to quantify uncertainty in conflict forecasting, shifting from traditional point predictions to full predictive distributions. Our approach compares and combines multiple tree-based classifiers and distributional regressors in a custom AutoML setup, estimating a distribution for each pgm individually. We also test the integration of regional models in spatial ensembles as a potential avenue to reduce uncertainty. The models consistently outperform a suite of benchmarks derived from conflict history in predictions up to one year in advance, with performance driven by regions where conflict was observed. With our evaluation, we emphasize the need to understand how a metric behaves for a given prediction problem, in our case one characterized by extreme zero inflation. Although integrating smaller regional models does not improve predictions, it also does not decrease performance for this prediction task, opening avenues to integrate data sources with less spatial coverage in the future.
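To make the hurdle idea and the zero-inflation issue concrete, the following is a minimal sketch and not the pipeline used in this work. It assumes a classifier has already produced a probability of any fatalities for one pgm and that a distributional regressor can be sampled for the positive counts. The names `hurdle_samples`, `crps_from_samples`, and `toy_positive_sampler` are hypothetical, and the sample-based continuous ranked probability score (CRPS) is used here only as one common example of a distributional metric.

```python
import random


def hurdle_samples(p_conflict, positive_sampler, n_samples=1000, rng=None):
    """Draw samples from a hurdle predictive distribution for one cell-month.

    p_conflict: probability of at least one fatality (stage 1, classifier).
    positive_sampler: callable rng -> positive count (stage 2, regressor).
    """
    if rng is None:
        rng = random.Random(0)
    samples = []
    for _ in range(n_samples):
        # Stage 1 decides whether the hurdle is crossed; stage 2 gives the size.
        if rng.random() < p_conflict:
            samples.append(positive_sampler(rng))
        else:
            samples.append(0)
    return samples


def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term1 = sum(abs(x - observed) for x in samples) / n
    # E|X - X'| over all ordered pairs, computed from the sorted samples.
    xs = sorted(samples)
    pair_sum = 2 * sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    term2 = pair_sum / (n * n)
    return term1 - 0.5 * term2


def toy_positive_sampler(rng):
    """Hypothetical stand-in for a distributional regressor: counts >= 1."""
    return 1 + int(rng.expovariate(0.1))


if __name__ == "__main__":
    forecast = hurdle_samples(0.05, toy_positive_sampler, n_samples=1000)
    all_zero = [0] * 1000
    for observed in (0, 25):
        print(observed,
              crps_from_samples(forecast, observed),
              crps_from_samples(all_zero, observed))
```

In this sketch an all-zero forecast scores a CRPS of exactly 0 whenever the observation is 0, and is penalized only in the rare cells with fatalities. Averaged over a grid where almost all observations are zero, such a trivial forecast can therefore look deceptively strong, which is one way zero inflation can shape how a metric behaves.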
