GeOT: A spatially explicit framework for evaluating spatio-temporal predictions

Nina Wiedemann, Théo Uscidda, Martin Raubal

TL;DR

GeOT is a spatially explicit framework for evaluating spatio-temporal predictions that uses Optimal Transport (OT) to quantify the relocation costs between predicted and ground-truth spatial distributions. Given a cost matrix defined over pairs of locations, GeOT computes the OT error $W^{geo}_c$ (and its partial version $W^{geo}_{c,\phi}$) to reflect real-world operational costs, and shows how Sinkhorn-based OT losses can be used to train models with spatial awareness. Validation on synthetic data links the OT error to spatial autocorrelation measures such as Moran's I, while case studies on bike sharing, charging stations, and traffic show that predictions align better with spatially distributed costs and reveal scale- and application-dependent trade-offs. The framework offers a flexible, interpretable metric that can guide model selection, the choice of aggregation level, and loss design; code is available for reproducibility. Overall, GeOT integrates spatial cost considerations directly into evaluation and training, enabling more cost-aware and spatially coherent predictions.

Abstract

When predicting observations across space and time, the spatial layout of errors impacts a model's real-world utility. For instance, in bike sharing demand prediction, error patterns translate to relocation costs. However, commonly used error metrics in GeoAI evaluate predictions point-wise, neglecting effects such as spatial heterogeneity, autocorrelation, and the Modifiable Areal Unit Problem. We put forward Optimal Transport (OT) as a spatial evaluation metric and loss function. The proposed framework, called GeOT, assesses the performance of prediction models by quantifying the transport costs associated with their prediction errors. Through experiments on real and synthetic data, we demonstrate that 1) the spatial distribution of prediction errors relates to real-world costs in many applications, 2) OT captures these spatial costs more accurately than existing metrics, and 3) OT enhances comparability across spatial and temporal scales. Finally, we advocate for leveraging OT as a loss function in neural networks to improve the spatial accuracy of predictions. Experiments with bike sharing, charging station, and traffic datasets show that spatial costs are significantly reduced with only marginal changes to non-spatial error metrics. Thus, this approach not only offers a spatially explicit tool for model evaluation and selection, but also integrates spatial considerations into model training. All code is available at https://github.com/mie-lab/geospatialOT.
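To make the idea of a Sinkhorn-based OT error more concrete, the following is a minimal, dependency-free sketch of entropy-regularised OT between a predicted and an observed vector. It is not taken from the authors' repository: the function name `sinkhorn_cost`, the parameters `eps` and `n_iters`, the cost matrix, and the demand values are all hypothetical. The sketch assumes strictly positive masses with equal totals, so it does not cover the partial variant.

```python
import math


def sinkhorn_cost(pred, obs, cost, eps=2.0, n_iters=200):
    """Entropy-regularised OT cost between two positive vectors of equal total mass.

    Illustrative sketch only (hypothetical helper, not the authors' code).
    """
    n, m = len(pred), len(obs)
    # Gibbs kernel: K_ij = exp(-C_ij / eps); larger eps means a blurrier plan
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so the plan's row sums match the predictions
        u = [pred[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan's column sums match the observations
        v = [obs[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan: T_ij = u_i * K_ij * v_j
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    # Transport cost of the regularised plan: sum_ij C_ij * T_ij
    total = sum(cost[i][j] * plan[i][j] for i in range(n) for j in range(m))
    return total, plan


# Hypothetical example: three stations, demand overestimated at station 0
# and underestimated at station 2 (both vectors sum to 160).
pred = [100.0, 50.0, 10.0]
obs = [10.0, 50.0, 100.0]
cost = [
    [0.0, 3.0, 5.0],
    [3.0, 0.0, 3.0],
    [5.0, 3.0, 0.0],
]

total, plan = sinkhorn_cost(pred, obs, cost)
print(f"regularised transport cost: {total:.2f}")
```

Because of the entropy term, the returned cost lies somewhat above the exact (unregularised) OT cost, and the gap shrinks as `eps` decreases, at the price of slower convergence. To use such a quantity as a training loss, the same iterations would be written in an automatic-differentiation framework so that gradients flow back to the predictions.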

Paper Structure

This paper contains 35 sections, 16 equations, 14 figures, 4 tables, 1 algorithm.

Figures (14)

  • Figure 1: Optimal transport as an evaluation framework in geospatial data science. Spatio-temporal prediction problems involve forecasting spatial observations over time, such as estimating bike-sharing demand at multiple stations (left). Conventional metrics usually treat locations independently, ignoring their spatial distribution (blue). In contrast, our GeOT framework based on Optimal Transport accounts for spatial costs, quantifying prediction errors in terms of the effort required for relocation or resource allocation, such as relocating bicycles between stations (green).
  • Figure 2: Quantifying spatial costs with Optimal Transport. Given a cost matrix $C$ defined between location pairs, prediction errors are measured in terms of the minimal transport costs required to align the predictions with the true observations (see \ref{sec:methods_eval}). In the example, a mass of 90 must be transported from location 1 to location 3 with cost 5, leading to an OT error of 450.
  • Figure 3: Comparison of MSE and OT error on synthetic data with increasingly unbalanced residuals ($\mu=0$: no imbalance, $\mu=1.5$: strong spatial imbalance). The imbalanced residuals lead to larger spatial costs, which is evident from the increasing OT error. There is also an evident relation of OT to spatial autocorrelation.
  • Figure 4: Relation between the OT error and the spatial autocorrelation of the residuals, measured with Moran's I. Empirically, the OT error strongly correlates with Moran's I (\ref{fig:ot_vs_moransi}). Analytically, we find that it combines standard error measures (A) with the ability to reflect spatial imbalance (B) and their distances (D). Since the OT error is computed as the minimal redistribution costs, it puts less focus on the similarity of neighboring points than Moran's I (C).
  • Figure 5: Transport map as computed with the GeOT framework. The goodness of the prediction is measured in terms of the relocation costs necessary to align the predictions with the real observations. Here, the difference between real and predicted bike sharing demand is shown, where mass is transported from bike sharing stations with overestimated demand (orange) to stations where the demand was underestimated (purple). In the example, the total spatial costs are rather low since most errors are balanced out with nearby points.
  • ...and 9 more figures
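
The arithmetic in the Figure 2 caption can be written out as a short worked equation. The notation here is an assumption made for illustration ($\hat{y}_i$ for the prediction at location $i$, $y_j$ for the observation at location $j$, $T$ for the transport plan); the caption itself only provides the cost matrix $C$, the transported mass of 90, and the unit cost of 5. Assuming equal total mass in predictions and observations, and zero cost for mass that stays in place, the OT error is the minimal total cost over all feasible plans:

$$
W = \min_{T \ge 0} \sum_{i,j} C_{ij}\, T_{ij}
\quad \text{s.t.} \quad
\sum_{j} T_{ij} = \hat{y}_i, \qquad \sum_{i} T_{ij} = y_j .
$$

In the caption's example, the only mass that moves is $T_{13} = 90$, at unit cost $C_{13} = 5$, so

$$
W = C_{13}\, T_{13} = 5 \cdot 90 = 450 .
$$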