Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling

Ivan Marisca; Cesare Alippi; Filippo Maria Bianchi

TL;DR

The work tackles spatiotemporal forecasting over sensor networks with missing data by introducing hierarchical spatiotemporal downsampling, which learns multi-scale representations. An interpretable attention mechanism, conditioned on the data and the missingness pattern, adaptively weights representations from different temporal and spatial scales, expanding the receptive field dynamically without imputing missing values. Results on synthetic and real-world benchmarks show strong accuracy and scalability, with the most notable gains when values are missing in contiguous blocks, and the decoder weights reveal which scales the model relies on. The approach makes forecasting more robust in non-ideal sensing environments and offers a flexible, end-to-end alternative to imputation-first pipelines.

Abstract

Given a set of synchronous time series, each associated with a sensor-point in space and characterized by inter-series relationships, the problem of spatiotemporal forecasting consists of predicting future observations for each point. Spatiotemporal graph neural networks achieve striking results by representing the relationships across time series as a graph. Nonetheless, most existing methods rely on the often unrealistic assumption that inputs are always available and fail to capture hidden spatiotemporal dynamics when part of the data is missing. In this work, we tackle this problem through hierarchical spatiotemporal downsampling. The input time series are progressively coarsened over time and space, obtaining a pool of representations that capture heterogeneous temporal and spatial dynamics. Conditioned on observations and missing data patterns, such representations are combined by an interpretable attention mechanism to generate the forecasts. Our approach outperforms state-of-the-art methods on synthetic and real-world benchmarks under different missing data distributions, particularly in the presence of contiguous blocks of missing values.
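The idea of coarsening a series over time while tracking which values were observed can be illustrated with a small sketch. This is not the authors' learned downsampling: it assumes a fixed mask-aware average over non-overlapping windows, and the names `masked_downsample` and `temporal_pyramid` are hypothetical. It only shows how a contiguous block of missing values that leaves gaps at a fine scale can be covered at a coarser one.

```python
import numpy as np

def masked_downsample(x, mask, factor=2):
    """Coarsen a (time, nodes) series by averaging only the observed values.

    Illustrative stand-in for a learned downsampling layer (assumption).
    Returns the coarsened series and a coarsened mask that is 1 where at
    least one value in the window was observed.
    """
    mask = mask.astype(bool)
    x = np.where(mask, x, 0.0)               # neutralise missing entries (even NaNs)
    T, N = x.shape
    T_trim = (T // factor) * factor          # drop a ragged tail, if any
    xw = x[:T_trim].reshape(-1, factor, N)
    mw = mask[:T_trim].reshape(-1, factor, N).astype(float)
    count = mw.sum(axis=1)                   # observed values per window
    total = (xw * mw).sum(axis=1)
    coarse = np.divide(total, count, out=np.zeros_like(total), where=count > 0)
    return coarse, (count > 0).astype(float)

def temporal_pyramid(x, mask, levels=2, factor=2):
    """Return [(series, mask)] from the finest to the coarsest temporal scale."""
    pyramid = [(x, mask.astype(float))]
    for _ in range(levels):
        x, mask = masked_downsample(x, mask, factor)
        pyramid.append((x, mask))
    return pyramid

# One sensor, eight time steps, a block of four missing values in the middle.
x = np.arange(8, dtype=float).reshape(8, 1)
mask = np.array([1, 1, 0, 0, 0, 0, 1, 1], dtype=bool).reshape(8, 1)
for level, (series, m) in enumerate(temporal_pyramid(x, mask)):
    print(level, series.ravel(), m.ravel())
```

In this toy run the coarsened mask still has two empty windows after one downsampling step and none after two, which is the intuition behind combining several scales rather than relying on the finest one.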

Paper Structure (39 sections, 23 equations, 9 figures, 6 tables)

Introduction
Preliminaries and Problem Formulation
Spatiotemporal Message Passing
Spatiotemporal Downsampling
Spatiotemporal Missing Data Distributions
Proposed Architecture
Input encoder
Temporal processing
Spatial processing
Decoder
Implementation Details
Analogy with filterbanks
Scalability
Related Work
Experiments
...and 24 more sections

Figures (9)

Figure 1: Overview of the proposed framework. The hierarchical design allows us to learn a pool of multi-scale spatiotemporal representations. Conditioned on the data and the missing value pattern, the attention mechanism dynamically combines representations from different scales to compute the predictions.
Figure 2: Overview of the proposed architecture. Given input data $\mathcal{G}_{t-W:t}$, all information associated with every $i$-th node and time step $t$ is encoded in vectors $\bm{h}^{i\langle 0 \rangle}_{t\langle 0 \rangle}$, then processed node-wise along the temporal dimension by alternating temporal message passing and downsampling. After each $l$-th layer, the sequences are combined in a single vector $\bm{z}^{i\langle 0 \rangle}_{t\langle l \rangle}$, which is then processed along the spatial dimension by alternating spatial message passing and pooling. Representations at each $k$-th pooling layer are then recursively un-pooled up to the initial node level, obtaining $\bm{z}^{i\langle k \rangle}_{t\langle 1: \rangle}$. Finally, the $L(K+1)$ encodings $\bm{z}^{i\langle 0: \rangle}_{t\langle 1: \rangle}$ are combined through an attention mechanism and fed to an MLP to obtain the predictions.
Figure 3: Details of the spatial processing procedure.
Figure 4: Decoder weights. The graph used to produce this plot is the undirected line graph shown next to the scores associated with the first spatial resolution $k=0$. Node colors in the graphs associated with higher spatial scales show how nodes are clustered into supernodes by $\bm{S}_k$ at each scale $k$.
Figure 5: Schematic depiction of the generation of the dataset.
...and 4 more figures
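
The last step in the Figure 2 caption, combining the $L(K+1)$ encodings of each node through attention, can be sketched as follows. This is a minimal illustration under stated assumptions: the scoring function is a single hypothetical linear vector `w_score`, the encodings are random, and the sizes `L = 3`, `K = 2` are arbitrary. The decoder described in the paper may compute its scores differently.

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def combine_scales(z, w_score):
    """Attention-weighted sum of per-node multi-scale encodings.

    z:       (N, S, D) array, S = L * (K + 1) encodings of size D per node
    w_score: (D,) hypothetical scoring vector (assumption, not from the paper)
    Returns the combined (N, D) encodings and the (N, S) attention weights.
    """
    scores = z @ w_score                         # (N, S): one score per scale
    alpha = softmax(scores, axis=1)              # weights sum to 1 for each node
    combined = (alpha[..., None] * z).sum(axis=1)
    return combined, alpha

L, K, N, D = 3, 2, 5, 8                          # illustrative sizes only
rng = np.random.default_rng(0)
z = rng.normal(size=(N, L * (K + 1), D))
w_score = rng.normal(size=D)
combined, alpha = combine_scales(z, w_score)
print(combined.shape, alpha.shape)
```

Because the weights `alpha` are a probability distribution over scales for each node, they can be inspected directly, which is the sense in which the decoder weights plotted in Figure 4 are interpretable.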
