On the Benefits of Memory for Modeling Time-Dependent PDEs
Ricardo Buitrago Ruiz, Tanya Marwah, Albert Gu, Andrej Risteski
TL;DR
The paper addresses the challenge of modeling time-dependent PDEs when observations are partial or noisy, arguing that memory of past states can significantly improve predictions in such regimes. It introduces the Memory Neural Operator (MemNO), a modular framework that combines a Markovian spatial operator with a memory layer (exemplified by S4) to capture temporal dependencies, and instantiates it as S4FFNO. On the theoretical side, it shows that memory terms can have an arbitrarily large impact even in idealized linear settings. Empirically, memory-based models reduce test error by up to 6x relative to memoryless baselines in low-resolution and high-frequency scenarios, and the improvements remain robust under observation noise in 2D Navier–Stokes. The findings suggest that memory-augmented neural operators are particularly valuable for PDE benchmarks with substantial high-frequency content and incomplete observations, enabling more accurate and efficient data-driven solvers in practical settings.
Abstract
Data-driven techniques have emerged as a promising alternative to traditional numerical methods for solving PDEs. For time-dependent PDEs, many approaches are Markovian -- the evolution of the trained system only depends on the current state, and not the past states. In this work, we investigate the benefits of using memory for modeling time-dependent PDEs: that is, when past states are explicitly used to predict the future. Motivated by the Mori-Zwanzig theory of model reduction, we theoretically exhibit examples of simple (even linear) PDEs, in which a solution that uses memory is arbitrarily better than a Markovian solution. Additionally, we introduce Memory Neural Operator (MemNO), a neural operator architecture that combines recent state space models (specifically, S4) and Fourier Neural Operators (FNOs) to effectively model memory. We empirically demonstrate that when the PDEs are supplied in low resolution or contain observation noise at train and test time, MemNO significantly outperforms the baselines without memory -- with up to 6x reduction in test error. Furthermore, we show that this benefit is particularly pronounced when the PDE solutions have significant high-frequency Fourier modes (e.g., low-viscosity fluid dynamics) and we construct a challenging benchmark dataset consisting of such PDEs.
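The gap between Markovian and memory-based prediction under partial observation can be illustrated with a toy example. The sketch below is not the paper's construction and not MemNO. It is a hypothetical two-dimensional linear system (a damped rotation, with made-up parameters) in which only the first coordinate is observed. By the Cayley-Hamilton theorem the observed coordinate satisfies an exact two-step linear recurrence, so a least-squares predictor that sees two past observations should recover it almost exactly, while a predictor that sees only the current observation cannot. The names `simulate` and `relative_error` are illustrative only.

```python
import numpy as np

def simulate(A, x0, steps):
    """Roll out the full linear system x_{t+1} = A x_t."""
    xs = [x0]
    for _ in range(steps):
        xs.append(A @ xs[-1])
    return np.array(xs)

def relative_error(y, lags):
    """Fit y[t] from its previous `lags` values by least squares
    and return the relative L2 error of the fitted predictions."""
    # Column k holds y[t - 1 - k] for t = lags, ..., len(y) - 1.
    X = np.column_stack(
        [y[lags - 1 - k : len(y) - 1 - k] for k in range(lags)]
    )
    target = y[lags:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.linalg.norm(X @ coef - target) / np.linalg.norm(target)

# Hypothetical system: rotation by theta with mild damping r.
theta, r = 0.3, 0.99
A = r * np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

states = simulate(A, np.array([1.0, 0.0]), steps=200)
y = states[:, 0]  # partial observation: only the first coordinate

err_markov = relative_error(y, lags=1)  # uses the current observation only
err_memory = relative_error(y, lags=2)  # uses one extra past observation

print(f"Markovian predictor relative error: {err_markov:.3e}")
print(f"Memory predictor relative error:    {err_memory:.3e}")
```

The unobserved second coordinate plays the role of the unresolved part of the state. The Markovian predictor has no access to it, whereas the extra past observation lets the memory-based predictor account for it implicitly.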
