Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

Yuan Yuan, Chenyang Shao, Jingtao Ding, Depeng Jin, Yong Li

TL;DR

This work proposes a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer that consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction.

Abstract

Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that rely heavily on common feature extraction or intricate few-shot learning designs, our solution performs generative pre-training on a collection of neural network parameters optimized with data from source cities. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts, allowing for adaptability to diverse data distributions and city-specific characteristics. GPD employs a Transformer-based denoising diffusion model that is model-agnostic and can be integrated with powerful spatio-temporal neural networks. By addressing challenges arising from data gaps and the complexity of generalizing knowledge across cities, our framework consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction. The implementation of our approach is available at https://github.com/tsinghua-fib-lab/GPD.
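
To make the idea of diffusion over network parameters concrete, the following is a minimal sketch of the standard DDPM forward (noising) step applied to a flattened parameter vector, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is not taken from the GPD repository: the linear noise schedule, the 1000-step horizon, and all function names are assumptions for illustration, and the prompt-conditioned Transformer denoiser that would be trained to predict eps is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not from the paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(params, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for a flattened parameter vector x_0."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in params]
    noisy = [signal * p + sigma * e for p, e in zip(params, noise)]
    # A denoiser would be trained to recover `noise` from `noisy`, t, and a prompt.
    return noisy, noise


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.5, -1.2, 0.03, 2.0]  # hypothetical flattened model parameters
x_t, eps = add_noise(x0, t=999, alpha_bars=alpha_bars, rng=rng)
```

At the final step almost none of the original parameter signal remains, which is why generation can start from pure noise and denoise toward parameters suited to the prompt.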

Paper Structure (32 sections, 9 equations, 16 figures, 14 tables, 3 algorithms)

This paper contains 32 sections, 9 equations, 16 figures, 14 tables, 3 algorithms.

Figures (16)

  • Figure 1: An overview of the proposed framework. (a) A collection of optimized spatio-temporal prediction models based on the dataset of source cities; each model's parameters are transformed into a vector-based format. (b) Pre-training the diffusion model to generate neural network parameters from the noise given the prompt. (c) Utilizing the pre-trained diffusion model to generate neural network parameters for the target city based on the target prompt.
  • Figure 2: The network architecture of the denoising network. (b) and (c) illustrate two conditioning strategies; the other conditioning strategies used are provided in the Appendix.
  • Figure 3: Performance comparison across different conditioning strategies.
  • Figure 4: Performance comparison of different prompts with Washington D.C. as the target city.
  • Figure 5: Comparison of data-space and parameter-space knowledge across cities. Colored points represent divided regions in the city.
  • ...and 11 more figures
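
Step (a) of Figure 1 transforms each optimized model's parameters into a vector-based format. A minimal sketch of one way to do this is shown below, using nested Python lists in place of tensors. The helper names, the layout record, and the toy model are hypothetical and are not drawn from the GPD implementation.

```python
def _flatten(tensor):
    """Recursively flatten a nested list of numbers into a flat list of floats."""
    if isinstance(tensor, (int, float)):
        return [float(tensor)]
    flat = []
    for item in tensor:
        flat.extend(_flatten(item))
    return flat


def _shape(tensor):
    """Infer the shape of a non-empty, rectangular nested list."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)


def _reshape(flat, shape):
    """Rebuild a nested list of the given shape from a flat list."""
    if len(shape) == 1:
        return list(flat)
    chunk = len(flat) // shape[0]
    return [_reshape(flat[i * chunk:(i + 1) * chunk], shape[1:])
            for i in range(shape[0])]


def params_to_vector(named_params):
    """Concatenate all parameters into one vector and record how to undo it."""
    vector, layout = [], []
    for name, tensor in named_params.items():
        flat = _flatten(tensor)
        layout.append((name, _shape(tensor), len(flat)))
        vector.extend(flat)
    return vector, layout


def vector_to_params(vector, layout):
    """Split a (possibly generated) vector back into named parameters."""
    named_params, offset = {}, 0
    for name, shape, size in layout:
        named_params[name] = _reshape(vector[offset:offset + size], shape)
        offset += size
    return named_params


# Hypothetical toy model: one linear layer with a 2x3 weight and a bias of size 2.
model = {
    "linear.weight": [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]],
    "linear.bias": [0.0, 1.0],
}
vector, layout = params_to_vector(model)
restored = vector_to_params(vector, layout)
```

Keeping the layout alongside the vector means a vector produced by a generative model can be loaded back into the same architecture, which corresponds to step (c) of the figure.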

Theorems & Definitions (1)

  • Definition 1: Spatio-Temporal Graph