Merging Memory and Space: A State Space Neural Operator

Nodens Koren, Samuel Lanthaler

TL;DR

The State Space Neural Operator (SS-NO) extends structured state space models (SSMs) to joint spatiotemporal modeling, introducing two key mechanisms: adaptive damping, which stabilizes learning by localizing receptive fields, and learnable frequency modulation, which enables data-driven spectral selection.

Abstract

We propose the *State Space Neural Operator* (SS-NO), a compact architecture for learning solution operators of time-dependent partial differential equations (PDEs). Our formulation extends structured state space models (SSMs) to joint spatiotemporal modeling, introducing two key mechanisms: *adaptive damping*, which stabilizes learning by localizing receptive fields, and *learnable frequency modulation*, which enables data-driven spectral selection. These components provide a unified framework for capturing long-range dependencies with parameter efficiency. Theoretically, we establish connections between SSMs and neural operators, proving a universality theorem for convolutional architectures with full field-of-view. Empirically, SS-NO achieves state-of-the-art performance across diverse PDE benchmarks (including 1D Burgers' and Kuramoto-Sivashinsky equations, and 2D Navier-Stokes and compressible Euler flows) while using significantly fewer parameters than competing approaches. A factorized variant of SS-NO further demonstrates scalable performance on challenging 2D problems. Our results highlight the effectiveness of damping and frequency learning in operator modeling, while showing that lightweight factorization provides a complementary path toward efficient large-scale PDE learning.
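
To make the two mechanisms concrete, the following is a minimal sketch of a convolution kernel built from damped oscillatory modes. It is an illustration under stated assumptions, not the paper's parameterization: the kernel form $k[n] = \sum_j w_j e^{-\alpha_j n\,\Delta x} \cos(\omega_j n\,\Delta x)$ and the names `alphas`, `omegas`, `weights`, and `effective_width` are hypothetical. The damping rates $\alpha_j$ control how quickly the kernel decays (the size of the receptive field), and the frequencies $\omega_j$ control which spectral content each mode responds to.

```python
import math


def damped_oscillatory_kernel(alphas, omegas, weights, length, dx=1.0):
    """Kernel k[n] = sum_j w_j * exp(-alpha_j * n*dx) * cos(omega_j * n*dx).

    Assumed form for illustration only: alpha_j >= 0 is a damping rate,
    omega_j a modulation frequency, w_j a mixing weight per mode.
    """
    kernel = []
    for n in range(length):
        x = n * dx
        value = sum(
            w * math.exp(-a * x) * math.cos(om * x)
            for a, om, w in zip(alphas, omegas, weights)
        )
        kernel.append(value)
    return kernel


def effective_width(kernel, tol=1e-3):
    """Number of taps up to the last one whose magnitude exceeds tol * peak."""
    peak = max(abs(v) for v in kernel)
    last = 0
    for n, v in enumerate(kernel):
        if abs(v) > tol * peak:
            last = n
    return last + 1


# Strong damping gives a short, localized kernel; weak damping a long one.
strong = damped_oscillatory_kernel([0.5], [0.0], [1.0], length=64)
weak = damped_oscillatory_kernel([0.05], [0.0], [1.0], length=64)
```

In this sketch, increasing a damping rate shrinks `effective_width`, which is one way to read "localizing receptive fields"; in a trained model both $\alpha_j$ and $\omega_j$ would be learned rather than fixed by hand.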

Paper Structure

This paper contains 89 sections, 3 theorems, 62 equations, 12 figures, and 8 tables.

Key Result

Theorem 4.1

A (factorized) convolutional NO architecture is universal if it has a full field of view.

Figures (12)

  • Figure 1: Detailed illustration of the spatial bidirectional SSM module. $B$: batch size, $T$: temporal length, $X$: spatial dimensions, $C$: input channels, $\sigma$: pointwise nonlinearity, and $+$: element-wise addition. The input is processed through both a forward spatial SSM and a flipped backward spatial SSM. Each path includes a residual connection and nonlinear activation, and their outputs are aggregated to form the final output.
  • Figure 2: Architecture combining Markovian 1D spatial SSM modules with a single temporal SSM following the MemNO framework. The spatial SSMs are applied sequentially across spatial dimensions, while the temporal SSM's position within the stack is a tunable hyperparameter.
  • Figure 3: Resolution Dependence Analysis. Relative $\ell_2$ error of SS-NO and baseline models on 1D benchmarks across varying spatial resolutions ($N \in \{32, \dots, 512\}$). Left three panels: Kuramoto-Sivashinsky (KS) equation with increasing viscosity coefficients $\nu \in \{0.075, 0.1, 0.125\}$. Right panel: Burgers' equation ($\nu=0.001$), limited to $N=128$ due to dataset constraints. While U-Net performance stagnates at higher resolutions ($N \ge 128$), operator learning methods generally improve. Notably, SS-NO (Ours) achieves superior accuracy early at $N=128$ and maintains the lowest error floor at $N=512$, demonstrating robust resolution efficiency.
  • Figure 4: Damping coefficient distributions for full vs. damping-only models at $\nu = 0.075$ with 64 states.
  • Figure 5: Damping coefficient distributions for $\nu = 0.125$ vs. $\nu = 0.075$ models with 64 states.
  • ...and 7 more figures
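
The bidirectional module described in the Figure 1 caption can be sketched for a single channel as follows. This is a simplified illustration, not the authors' implementation: it assumes a scalar linear recurrence $h_n = a\,h_{n-1} + b\,u_n$, $y_n = c\,h_n$, applies the residual before the nonlinearity, and aggregates the two paths by summation. The caption does not fix these details, so the function names and the parameters `a`, `b`, `c`, and `sigma` are hypothetical.

```python
import math


def ssm_scan(u, a, b, c):
    """Causal scalar SSM: h_n = a*h_{n-1} + b*u_n, y_n = c*h_n."""
    h = 0.0
    y = []
    for u_n in u:
        h = a * h + b * u_n
        y.append(c * h)
    return y


def ssm_path(u, a, b, c, sigma):
    """One path: SSM output plus residual, then a pointwise nonlinearity
    (this ordering is an assumption)."""
    y = ssm_scan(u, a, b, c)
    return [sigma(y_n + u_n) for y_n, u_n in zip(y, u)]


def bidirectional_ssm(u, a=0.5, b=1.0, c=1.0, sigma=math.tanh):
    """Forward path plus flipped backward path, aggregated by summation."""
    forward = ssm_path(u, a, b, c, sigma)
    # Flip the input, scan causally, then flip the result back so that
    # position n sees information from positions to its right.
    backward = ssm_path(u[::-1], a, b, c, sigma)[::-1]
    return [f + g for f, g in zip(forward, backward)]


# An impulse in the middle of the domain spreads in both directions.
out = bidirectional_ssm([0.0, 0.0, 1.0, 0.0, 0.0])
```

With only the forward scan, positions to the left of the impulse would receive no signal; the flipped backward scan is what gives each spatial location a view in both directions.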

Theorems & Definitions (7)

  • Definition
  • Theorem 4.1
  • Definition (rigorous)
  • Remark 1
  • Theorem B.1
  • Proof
  • Lemma B.2