Damba-ST: Domain-Adaptive Mamba for Efficient Urban Spatio-Temporal Prediction

Rui An, Yifeng Zhang, Ziran Liang, Wenqi Fan, Yuxuan Liang, Xuequn Shang, Qing Li

TL;DR

This work tackles cross-city urban spatio-temporal forecasting under domain heterogeneity and the quadratic computational cost of Transformer-based models. It introduces Damba-ST, a domain-adaptive Mamba backbone built around a Domain-Adaptive State Space Model (DASSM) with Domain Adapters. Three complementary views (Spatial, Temporal, and ST-Delay) are processed via Intra-Domain Scanning and Cross-Domain Adaptation. The model learns domain-specific discriminative patterns while aligning cross-domain commonalities, retaining complexity that is linear in sequence length and showing strong zero-shot generalization across regions, cities, and tasks. Experiments report state-of-the-art prediction performance and fast inference on real-world data, supporting deployment in new urban environments without extensive retraining.

Abstract

Training urban spatio-temporal foundation models that generalize well across diverse regions and cities is critical for deploying urban services in unseen or data-scarce regions. Recent studies have typically focused on fusing cross-domain spatio-temporal data to train unified Transformer-based models. However, these models suffer from quadratic computational complexity and high memory overhead, limiting their scalability and practical deployment. Inspired by the efficiency of Mamba, a state space model with linear time complexity, we explore its potential for efficient urban spatio-temporal prediction. However, directly applying Mamba as a spatio-temporal backbone leads to negative transfer and severe performance degradation. This is primarily due to spatio-temporal heterogeneity and the recursive mechanism of Mamba's hidden state updates, which limit cross-domain generalization. To overcome these challenges, we propose Damba-ST, a novel domain-adaptive Mamba-based model for efficient urban spatio-temporal prediction. Damba-ST retains Mamba's linear complexity advantage while significantly enhancing its adaptability to heterogeneous domains. Specifically, we introduce two core innovations: (1) a domain-adaptive state space model that partitions the latent representation space into a shared subspace for learning cross-domain commonalities and independent, domain-specific subspaces for capturing intra-domain discriminative features; (2) three distinct Domain Adapters, which serve as domain-aware proxies to bridge disparate domain distributions and facilitate the alignment of cross-domain commonalities. Extensive experiments demonstrate the generalization and efficiency of Damba-ST. It achieves state-of-the-art performance on prediction tasks and demonstrates strong zero-shot generalization, enabling seamless deployment in new urban environments without extensive retraining or fine-tuning.
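
The idea of splitting the state space into a shared subspace and domain-specific subspaces, while keeping a single linear-time pass over the sequence, can be illustrated with a toy recurrence. This is a minimal sketch and not the authors' DASSM: it assumes a fixed diagonal update $h_t = a \cdot h_{t-1} + (1 - a) \cdot x_t$ instead of Mamba's input-dependent parameters, it omits the Domain Adapters entirely, and the function name `scan_partitioned_ssm` and the domain labels `city_a` and `city_b` are hypothetical.

```python
from typing import Dict, List, Sequence


def scan_partitioned_ssm(
    inputs: Sequence[float],
    domain: str,
    shared_decay: Sequence[float],
    domain_decay: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Run a diagonal linear recurrence h_t = a * h_{t-1} + (1 - a) * x_t.

    The state is the concatenation of a shared block (same decay values for
    every domain) and a block whose decay values are looked up by domain.
    Returns the full state after each time step.
    """
    # Assumption: fixed decays; a real Mamba layer computes them from the input.
    decays = list(shared_decay) + list(domain_decay[domain])
    state = [0.0] * len(decays)
    history: List[List[float]] = []
    for x in inputs:  # single pass, so cost grows linearly with len(inputs)
        state = [a * h + (1.0 - a) * x for a, h in zip(decays, state)]
        history.append(list(state))
    return history


# Toy usage: two domains share one state channel and own one channel each.
shared = [0.5]
per_domain = {"city_a": [0.9], "city_b": [0.1]}
series = [1.0, 1.0, 1.0]

out_a = scan_partitioned_ssm(series, "city_a", shared, per_domain)
out_b = scan_partitioned_ssm(series, "city_b", shared, per_domain)
```

On the same input, the shared channel evolves identically for both domains, while the domain-specific channel diverges because each domain has its own decay. In this toy setting, that is the separation between cross-domain commonalities and intra-domain patterns that the abstract describes.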

Paper Structure

This paper contains 55 sections, 2 theorems, 30 equations, 5 figures, 3 tables.

Key Result

Theorem 1

(Generalization Capability of Learnable Prompt $P$) Given a Spatio-Temporal model $\mathcal{F}_\Theta$, and the input spatio-temporal data $(\mathcal{G},\mathcal{X})$. For any domain generalization transformation function $g \colon \mathbb{D}_{\text{specific}} \rightarrow \mathbb{D}_{\text{common}}$

Figures (5)

  • Figure 1: Spatio-Temporal Heterogeneity and Our Motivation. (a) Regions A and B represent a transportation hub and residential area, each with distinct urban functions. Traffic patterns vary significantly over time (e.g., morning to evening and weekdays to holidays). (b) Distinct regions exhibit notable differences in traffic flow volume. (c) Such spatio-temporal heterogeneity leads to significant discrepancies in dataset distributions. (d) Given cross-domain data $\{\mathcal{D}_1,\mathcal{D}_2, \mathcal{D}_3\}$, we propose partitioning the Mamba representation space into a shared subspace for cross-domain commonalities, denoted as $C$, and independent subspaces for domain-specific patterns, i.e., $\{S_1, S_2, S_3\}$.
  • Figure 2: Overall framework. Damba-ST comprises three key modules: Multi-View Encoding (MVE), Intra-Domain Scanning (IDS), and Cross-Domain Adaptation (CDA). The CDA module integrates three variants of the Domain-Adaptive State Space Model (DASSM): Spatial, Temporal, and ST-Delay DASSM. Each variant contains a Discrimination Learner, an Adapter Learner, and a Commonalities Learner.
  • Figure 3: Ablation Study on the CAD3 and NYC-BIKE Datasets
  • Figure 4: Efficiency analysis of running time and GPU memory in long-term prediction. Damba-ST scales linearly with the series length.
  • Figure 5: Hyperparameter Analysis.

Theorems & Definitions (2)

  • Theorem 1
  • Proposition 1