Latent Mamba Operator for Partial Differential Equations
Karn Tiwari, Niladri Dutta, N M Anoop Krishnan, Prathosh A P
TL;DR
Latent Mamba Operator (LaMO) is a latent-space, bidirectional state-space model (SSM) framework for neural operators that solve parametric PDEs efficiently and accurately. LaMO encodes PDE inputs into latent tokens, applies latent SSMs to learn data-dependent kernel integrals, and decodes back to the physical domain. This design gives computational complexity that is linear in the number of mesh points and delivers state-of-the-art (SOTA) performance across diverse benchmarks. The paper establishes a theoretical connection between SSMs and kernel-integral operators, and reports an average 32.3% improvement over SOTA baselines on regular grids, structured meshes, and point clouds, including challenging time-dependent flows. These results suggest LaMO’s potential as a scalable foundation model for PDE solving in scientific machine learning (SciML), with future work focusing on pretraining, unsupervised learning, and enhanced scanning strategies.
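The encode, scan, decode pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`ssm_scan`, `bidirectional_ssm`, `latent_operator`), the pointwise linear encoder and decoder, the one-token-per-mesh-point layout, and the fixed matrices `A`, `B`, `C` are all hypothetical. Mamba-style SSMs make these parameters input-dependent, which the sketch omits for brevity.

```python
import numpy as np

def ssm_scan(u, A, B, C):
    """Sequential linear SSM: h_k = A h_{k-1} + B u_k, y_k = C h_k.

    u has shape (n_tokens, d_latent). The loop visits each token once,
    so the cost grows linearly with the number of tokens.
    """
    n_tokens = u.shape[0]
    h = np.zeros(A.shape[0])                  # hidden state, starts at zero
    y = np.empty((n_tokens, C.shape[0]))
    for k in range(n_tokens):
        h = A @ h + B @ u[k]
        y[k] = C @ h
    return y

def bidirectional_ssm(u, A, B, C):
    """Sum of a forward scan and a scan over the reversed token order."""
    forward = ssm_scan(u, A, B, C)
    backward = ssm_scan(u[::-1], A, B, C)[::-1]
    return forward + backward

def latent_operator(x, W_enc, W_dec, A, B, C):
    """Encode inputs to latent tokens, mix them with an SSM, decode.

    Assumption: a pointwise linear encoder/decoder stands in for
    whatever encoder and decoder the paper actually uses.
    """
    z = x @ W_enc                             # (n_points, d_latent)
    z = z + bidirectional_ssm(z, A, B, C)     # residual token mixing
    return z @ W_dec                          # (n_points, d_out)

# Usage on random data with hypothetical sizes.
rng = np.random.default_rng(0)
n_points, d_in, d_latent, d_state, d_out = 16, 3, 4, 5, 2
x = rng.normal(size=(n_points, d_in))
W_enc = rng.normal(size=(d_in, d_latent))
W_dec = rng.normal(size=(d_latent, d_out))
A = 0.5 * np.eye(d_state)                     # stable state transition
B = rng.normal(size=(d_state, d_latent))
C = rng.normal(size=(d_latent, d_state))
out = latent_operator(x, W_enc, W_dec, A, B, C)
print(out.shape)
```

Each scan touches every token exactly once, which is where the linear scaling in the number of mesh points comes from in this simplified setting. The backward pass lets every output depend on tokens both before and after it in the scan order.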
Abstract
Neural operators have emerged as powerful data-driven frameworks for solving Partial Differential Equations (PDEs), offering significant speedups over numerical methods. However, existing neural operators struggle with scalability in high-dimensional spaces, incur high computational costs, and face challenges in capturing continuous and long-range dependencies in PDE dynamics. To address these limitations, we introduce the Latent Mamba Operator (LaMO), which integrates the efficiency of state-space models (SSMs) in latent space with the expressive power of kernel integral formulations in neural operators. We also establish a theoretical connection between SSMs and the kernel integral of neural operators. In extensive experiments across diverse PDE benchmarks on regular grids, structured meshes, and point clouds, covering solid and fluid physics datasets, LaMO achieves consistent state-of-the-art (SOTA) performance, with a 32.3% improvement over existing baselines in solution operator approximation, highlighting its efficacy in modeling complex PDE solutions.
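One way to see why an SSM can be read as a kernel integral is to unroll a discretized linear SSM. The identity below is the standard unrolling of a linear recurrence and is offered only as an illustration, not as the paper's theorem. The symbols $\bar{A}_k$, $\bar{B}_k$, $C_k$, $\kappa$, and the quadrature weights $\Delta_j$ are notation assumed here.

```latex
\begin{aligned}
h_k &= \bar{A}\,h_{k-1} + \bar{B}\,u_k, \qquad y_k = C\,h_k, \qquad h_{-1} = 0,\\[2pt]
y_k &= \sum_{j=0}^{k} C\,\bar{A}^{\,k-j}\,\bar{B}\,u_j
     = \sum_{j=0}^{k} \kappa(k,j)\,u_j,
     \qquad \kappa(k,j) := C\,\bar{A}^{\,k-j}\,\bar{B},\\[2pt]
(\mathcal{K}u)(x_k) &= \int_{\Omega} \kappa(x_k,\xi)\,u(\xi)\,\mathrm{d}\xi
     \;\approx\; \sum_{j} \kappa(x_k,x_j)\,u(x_j)\,\Delta_j .
\end{aligned}
```

The unrolled SSM output has the same form as the quadrature of a kernel integral, with a causal kernel determined by the SSM parameters. When the parameters depend on the input, as in selective SSMs, the same unrolling gives $\kappa(k,j) = C_k \big(\prod_{i=j+1}^{k} \bar{A}_i\big) \bar{B}_j$, so the kernel itself becomes data-dependent. A bidirectional scan adds the mirrored terms with $j \ge k$, removing the causal restriction.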
