A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation

Yakun Chen; Kaize Shi; Zhangkai Wu; Juan Chen; Xianzhi Wang; Julian McAuley; Guandong Xu; Shui Yu

TL;DR

This work tackles spatiotemporal imputation under non-stationarity by introducing C$^2$TSD, a conditional diffusion model that disentangles temporal structure into trend and seasonality and augments training with contrastive representation learning. By conditioning the reverse diffusion process on disentangled temporal features and on spatial context from a Graph Neural Network, and by employing temporal and spatial attention in the noise predictor, the method produces accurate, probabilistic imputations while mitigating the error accumulation typical of recurrent approaches. Experiments on AQI-36, PEMS-BAY, and METR-LA show consistent improvements over strong baselines, and ablations confirm the critical roles of contrastive learning and temporal disentanglement. The approach generalizes robustly across unseen distributions and missing patterns, at the cost of the higher computational demand inherent to diffusion models.

Abstract

Spatiotemporal data analysis is pivotal across various domains, such as transportation, meteorology, and healthcare. The data collected in real-world scenarios are often incomplete due to device malfunctions and network errors. Spatiotemporal imputation aims to predict missing values by exploiting the spatial and temporal dependencies in the observed data. Traditional imputation approaches based on statistical and machine learning techniques require the data to conform to their distributional assumptions, while graph and recurrent neural networks are prone to error accumulation due to their recurrent structures. Generative models, especially diffusion models, can potentially circumvent the reliance on inaccurate, previously imputed values for future predictions; however, diffusion models still face challenges in generating stable results. We propose to address these challenges by designing conditional information to guide the generative process and expedite training. We introduce a conditional diffusion framework called C$^2$TSD, which incorporates disentangled temporal (trend and seasonality) representations as conditional information and employs contrastive learning to improve generalizability. Our extensive experiments on three real-world datasets demonstrate the superior performance of our approach compared to a number of state-of-the-art baselines.
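To make the idea of a conditionally guided reverse process more concrete, the following is a minimal sketch of standard DDPM ancestral sampling applied to imputation. It is not the authors' implementation: the linear noise schedule, the function names (`make_schedule`, `reverse_step`, `impute`), and the `eps_model` callable are all assumptions introduced for illustration. In C$^2$TSD the noise predictor is a trained network with temporal and spatial attention, whereas here it is any function of the noisy sample, the step index, and the conditional information.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.2):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def reverse_step(x_t, t, cond, eps_model, betas, alphas, alpha_bars, rng):
    """One DDPM reverse step: estimate the noise, then sample x_{t-1}."""
    eps_hat = eps_model(x_t, t, cond)  # hypothetical conditional noise predictor
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added at the final step
    # Standard DDPM posterior variance for step t.
    var = betas[t] * (1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t])
    return mean + np.sqrt(var) * rng.standard_normal(x_t.shape)


def impute(observed, mask, cond, eps_model, num_steps=50, seed=0):
    """Sample values for missing entries; keep observed entries unchanged.

    observed: array of values (entries where mask is False are ignored)
    mask:     boolean array, True where a value was observed
    cond:     conditional information passed through to eps_model
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(observed.shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        x = reverse_step(x, t, cond, eps_model, betas, alphas, alpha_bars, rng)
    return np.where(mask, observed, x)
```

Because every missing entry is sampled jointly from noise rather than predicted from earlier imputed values, no imputation error is fed forward through time. Calling `impute` with several different seeds gives a set of samples whose spread can serve as a rough uncertainty estimate.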

Paper Structure (27 sections, 14 equations, 5 figures, 3 tables, 2 algorithms)


Introduction
Approach
Problem Statement
Approach Overview
Training Process
Imputation Process
Conditional Information Construction Module
Noise Prediction Model
Contrastive Learning Strategy
Experiments
Datasets
Baselines
Evaluation Metrics
Experimental Design
Datasets Split
...and 12 more sections

Figures (5)

Figure 1: An illustrative example of trend, seasonality, and noise components (the top three boxes) that constitute a time series (at the bottom).
Figure 2: Architecture of C$^2$TSD. Our framework follows the pipeline of denoising diffusion probabilistic models. In particular, C$^2$TSD uses a trained noise prediction model $\epsilon_\theta$ to sample $\widetilde{\mathbf{X}}^{t-1}$ step by step in the reverse process, under the guidance of the conditional information $\mathbf{C}^{\mathit{Con}}$ generated from interpolated observed data $\mathcal{X}$ and geographical information $\mathbf{G}$. It also uses a contrastive loss to supplement the reconstruction loss of the diffusion model.
Figure 3: Overview of conditional information construction and noise prediction. The Conditional Information Construction Module processes observed values to learn the conditional representation $\mathbf{C}^{\mathit{Con}}$, which is later used to guide the learning of spatiotemporal dependencies in the reverse process. The Noise Prediction Model takes the noisy information $\mathbf{H}^{in}$, the conditional feature $\mathbf{C}^{\mathit{Con}}$, and the adjacency matrix $\mathbf{A}$ as input and converts the noisy information into spatiotemporal data with imputed values. Additionally, contrastive learning (see Contrastive Learning Strategy) helps the model learn discriminative features and generalize.
Figure 4: The architecture of Trend Feature Extraction and Seasonal Feature Extraction. The Trend Feature Extraction (the upper box) consists of a stack of causal convolution layers with different kernel sizes. The Seasonal Feature Extraction (the lower box) is implemented by FFT and iFFT layers.
Figure 5: Sensitivity study of key hyperparameters.
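
The decomposition illustrated in Figures 1 and 4 can be approximated with a short NumPy sketch. This is a simplified stand-in under stated assumptions, not the paper's module: the causal convolutions below use fixed averaging kernels instead of learned weights, the seasonal branch keeps only the `top_k` strongest frequency components between the FFT and the inverse FFT, and the names `causal_moving_average`, `extract_trend`, and `extract_seasonal` are hypothetical.

```python
import numpy as np


def causal_moving_average(x, kernel_size):
    """Causal convolution with a uniform kernel.

    The series is left-padded with its first value, so the output at
    time i depends only on x[:i + 1] (no look-ahead).
    """
    padded = np.concatenate([np.full(kernel_size - 1, x[0]), x])
    kernel = np.ones(kernel_size) / kernel_size
    return np.convolve(padded, kernel, mode="valid")


def extract_trend(x, kernel_sizes=(2, 4, 8)):
    """Average several causal smoothers with different kernel sizes."""
    return np.mean([causal_moving_average(x, k) for k in kernel_sizes], axis=0)


def extract_seasonal(x, top_k=1):
    """Keep the top_k strongest frequency components (FFT, filter, inverse FFT)."""
    spectrum = np.fft.rfft(x - x.mean())
    keep = np.argsort(np.abs(spectrum))[-top_k:]
    filtered = np.zeros_like(spectrum)
    filtered[keep] = spectrum[keep]
    return np.fft.irfft(filtered, n=len(x))
```

A possible usage, assuming a 1-D series `x` of hourly readings: compute `trend = extract_trend(x)`, then `seasonal = extract_seasonal(x - trend)`, and treat `x - trend - seasonal` as the residual noise component shown in Figure 1.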
