Decision-aware training of spatiotemporal forecasting models to select a top K subset of sites for intervention
Kyle Heuton, F. Samuel Muench, Shikhar Shrestha, Thomas J. Stopka, Michael C. Hughes
TL;DR
This paper addresses how to allocate scarce intervention resources across many spatial sites using spatiotemporal forecasts, evaluated with a decision-centric metric: the fraction of best possible reach (BPR). It develops a ratio-estimator ranking to produce top-$K$ site selections and a training framework called decision-aware maximum likelihood (DAML) that balances predictive likelihood against a BPR constraint, using perturbed optimizers to enable gradient-based learning through discrete top-$K$ decisions. The authors demonstrate that standard maximum likelihood training can be suboptimal for decision quality, while DAML and direct BPR optimization improve top-$K$ decisions with varying effects on forecast likelihood. These results hold on synthetic data and on real-world applications in opioid overdose forecasting and wildlife monitoring. Together, the work offers practical methods and theoretical justification for ranking sites and training spatiotemporal models to optimize top-$K$ interventions in public health and conservation planning.
Abstract
Optimal allocation of scarce resources is a common problem for decision makers faced with choosing a limited number of locations for intervention. Spatiotemporal prediction models could make such decisions data-driven. A recent performance metric called fraction of best possible reach (BPR) measures the impact of using a model's recommended size-K subset of sites compared to the best possible top-K in hindsight. We tackle two open problems related to BPR. First, we explore how to rank all sites numerically given a probabilistic model that predicts event counts jointly across sites. Ranking via the per-site mean is suboptimal for BPR. Instead, we offer a better ranking for BPR backed by decision theory. Second, we explore how to train a probabilistic model's parameters to maximize BPR. Discrete selection of K sites implies all-zero parameter gradients, which prevent standard gradient-based training. We overcome this barrier via advances in perturbed optimizers. We further suggest a training objective that combines likelihood with a decision-aware BPR constraint to deliver high-quality top-K rankings as well as good forecasts for all sites. We demonstrate our approach on two where-to-intervene applications: mitigating opioid-related fatal overdoses for public health and monitoring endangered wildlife.
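To make the two central ideas concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that "reach" is the total observed event count at the selected sites, so that BPR is the reach of the model's top-K divided by the reach of the hindsight-best top-K, and it illustrates the perturbed-optimizer idea generically by averaging hard top-K indicator vectors over Gaussian noise added to the scores. All function names, the noise scale `sigma`, the sample count, and the convention that BPR equals 1 when the best reach is zero are hypothetical choices made for illustration.

```python
import numpy as np


def top_k_indices(scores, k):
    """Return indices of the k largest scores (ties go to the lowest index)."""
    scores = np.asarray(scores, dtype=float)
    # A stable sort on negated scores keeps the earliest index among ties.
    return np.argsort(-scores, kind="stable")[:k]


def fraction_of_best_possible_reach(scores, observed_counts, k):
    """BPR: events reached by the model's top-k over events in the best top-k.

    Assumption: reach is the sum of observed counts at the selected sites.
    """
    observed_counts = np.asarray(observed_counts, dtype=float)
    reached = observed_counts[top_k_indices(scores, k)].sum()
    best = observed_counts[top_k_indices(observed_counts, k)].sum()
    # Assumed convention: if no events occur anywhere in the best top-k,
    # any selection is as good as the best one.
    return 1.0 if best == 0 else reached / best


def perturbed_top_k_indicator(scores, k, sigma=1.0, n_samples=1000, seed=0):
    """Monte Carlo estimate of a smoothed top-k membership vector.

    Each sample adds Gaussian noise to the scores and takes a hard top-k;
    the average varies smoothly with the scores, unlike a single hard top-k.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    noisy = scores + sigma * rng.standard_normal((n_samples, scores.size))
    idx = np.argsort(-noisy, axis=1, kind="stable")[:, :k]
    indicator = np.zeros_like(noisy)
    np.put_along_axis(indicator, idx, 1.0, axis=1)
    return indicator.mean(axis=0)


# Toy example with four sites and K = 2 (made-up numbers).
model_scores = [5.0, 1.0, 4.0, 2.0]
observed = [3, 10, 2, 0]
bpr = fraction_of_best_possible_reach(model_scores, observed, k=2)
soft_membership = perturbed_top_k_indicator(model_scores, k=2)
```

In this toy example the model selects sites 0 and 2 and reaches 5 events, while the hindsight-best pair (sites 1 and 0) reaches 13, so `bpr` is 5/13. The entries of `soft_membership` sum to K because every noisy sample selects exactly K sites. In an automatic-differentiation framework, a smoothed quantity of this kind is what allows gradients to flow through the selection step.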
