Forests of Uncertaint(r)ees: Using tree-based ensembles to estimate probability distributions of future conflict

Daniel Mittermaier, Tobias Bohne, Martin Hofer, Daniel Racek

TL;DR

This work forecasts fatalities from armed conflict at fine spatio-temporal resolution, shifting from point predictions to full predictive distributions. It implements a modular AutoML pipeline that combines tree-based classifiers and distributional regressors in a hurdle framework, and it tests regional (local) ensembles as a way to handle data heterogeneity. Across six test windows, the approach generally outperforms history-based benchmarks on distributional metrics (CRPS, MIS, IGN). The evaluation also shows how extreme zero-inflation shapes metric behavior, and that regional models do not improve predictions but do not degrade them either. The study points to practical pathways toward actionable uncertainty estimates for early-warning systems and outlines directions for incorporating broader data sources and selective prediction strategies.

Abstract

Predictions of fatalities from violent conflict on the PRIO-GRID-month (pgm) level are characterized by high levels of uncertainty, limiting their usefulness in practical applications. We discuss the two main sources of uncertainty for this prediction task, the nature of violent conflict and data limitations, embedding this in the wider literature on uncertainty quantification in machine learning. We develop a strategy to quantify uncertainty in conflict forecasting, shifting from traditional point predictions to full predictive distributions. Our approach compares and combines multiple tree-based classifiers and distributional regressors in a custom auto-ML setup, estimating distributions for each pgm individually. We also test the integration of regional models in spatial ensembles as a potential avenue to reduce uncertainty. The models are able to consistently outperform a suite of benchmarks derived from conflict history in predictions up to one year in advance, with performance driven by regions where conflict was observed. With our evaluation, we emphasize the need to understand how a metric behaves for a given prediction problem, in our case characterized by extremely high zero-inflatedness. While not resulting in better predictions, the integration of smaller models does not decrease performance for this prediction task, opening avenues to integrate data sources with less spatial coverage in the future.
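
To make the idea of a per-cell predictive distribution concrete, the following is a minimal sketch of how samples could be drawn from a two-stage (hurdle) model: a classifier stage supplies the probability of any fatalities, and a regressor stage supplies a distribution over positive counts. The function names, the probability value, and the shifted-geometric sampler are assumptions made for illustration only; the paper's actual stages are tree-based models that are not reproduced here.

```python
import random


def sample_hurdle(p_nonzero, draw_positive, n_samples=1000, seed=0):
    """Draw samples from a hurdle-style predictive distribution.

    p_nonzero: probability that fatalities > 0 (stand-in for a classifier output).
    draw_positive: callable taking a random.Random and returning a count >= 1
        (stand-in for a distributional regressor; hypothetical here).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        if rng.random() < p_nonzero:
            # Hurdle crossed: draw a strictly positive count.
            samples.append(draw_positive(rng))
        else:
            # Hurdle not crossed: the prediction is exactly zero.
            samples.append(0)
    return samples


def draw_shifted_geometric(rng, q=0.2):
    """Hypothetical positive-count sampler: 1 plus the failures before a success."""
    count = 1
    while rng.random() > q:
        count += 1
    return count


# Assumed example: a grid cell with a 5% chance of any fatalities next month.
samples = sample_hurdle(0.05, draw_shifted_geometric, n_samples=2000)
share_zero = samples.count(0) / len(samples)
```

With a low `p_nonzero`, most samples are zero and only the upper tail carries positive counts, which is the zero-inflated shape that the evaluation in the paper has to account for.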

Paper Structure

This paper contains 17 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Visualization of the creation process of the clusters for local models. a) clusters created by a manually tuned HDBSCAN algorithm, with corresponding polygons; b) clusters after merging of smaller clusters, with updated polygons; c) final clusters with grid cells without violence assigned. Grid cells shown in a) and b) are those experiencing any violence in the training data (1990-2017). Black grid cells displayed in a) and b) are not initially assigned to any cluster by HDBSCAN.
  • Figure 2: Prediction intervals for example grid cells based on 2018 predictions from the global model, compared to the observed number of fatalities and a maximum a posteriori (MAP) estimate for the predictive distributions. MAP estimates are calculated as the maximum of the probability density function estimated via Gaussian kernel density estimation. 25%, 50%, 75%, 90%, and 95% prediction intervals are only drawn if upper boundaries are not zero. Figures 2b) and 2c) also include the extreme boundaries (min/max) of the corresponding samples.
  • Figure 3: Value ranges of CRPS scores for one year of simulated actuals and predictions, with varying noise and “accuracy”. Actuals are simulated with zero-inflatedness matching the training data, and values drawn from an estimated PDF based on non-zero observations < 1000 fatalities. Predictions are drawn from a Poisson distribution with mean and variance equal to the corresponding actual. “Accuracy” ($\alpha$) less than 1 means a share of non-zero samples was replaced with all-zero predictions. Noise ($n$) means actuals were shifted by $50n\varepsilon$ before creating the samples, with $\varepsilon$ randomly drawn from a uniform distribution between -1 and 1.
  • Figure 4: Comparison of model-wide scores with scores based on grid cells grouped by individual countries for the three distributional metrics: CRPS (a), MIS (b), and IGN (c).
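
Since several captions above rely on CRPS computed from predictive samples, a short sketch of the standard sample-based estimator may help: CRPS is approximated as the mean absolute error of the samples against the observation, minus half the mean absolute difference between pairs of samples. The function name is hypothetical, and whether the paper uses this exact estimator is an assumption.

```python
def crps_from_samples(samples, observed):
    """Empirical CRPS: E|X - y| - 0.5 * E|X - X'| over a finite sample."""
    n = len(samples)
    # First term: average distance between each sample and the observation.
    term_obs = sum(abs(x - observed) for x in samples) / n
    # Second term: average pairwise distance, computed from the sorted sample.
    # For sorted xs, sum over i < j of (xs[j] - xs[i]) = sum_k (2k - n + 1) * xs[k].
    xs = sorted(samples)
    pair_sum = sum((2 * k - n + 1) * x for k, x in enumerate(xs))
    term_spread = 2 * pair_sum / (n * n)  # mean of |x_i - x_j| over all ordered pairs
    return term_obs - 0.5 * term_spread


# All-zero prediction in a cell with no fatalities: perfect score.
score_quiet = crps_from_samples([0] * 10, 0)      # 0.0
# The same all-zero prediction when 5 fatalities occur: penalty equals the count.
score_missed = crps_from_samples([0] * 10, 5)     # 5.0
# A two-point sample around the observation.
score_spread = crps_from_samples([0, 2], 1)       # 0.5
```

The first two cases illustrate why zero-inflation matters for interpretation: an all-zero forecast scores perfectly in the vast majority of grid cells, so aggregate CRPS is dominated by the comparatively few cells where conflict is observed.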