Towards Multi-spatiotemporal-scale Generalized PDE Modeling

Jayesh K. Gupta; Johannes Brandstetter

TL;DR

This work benchmarks several neural PDE surrogates (Fourier Neural Operators, ResNets, and U-Nets) for multi-scale fluid dynamics, focusing on generalization across PDE parameters and time-scales. It introduces parameter-conditioning strategies and a PyTorch-based benchmarking framework for fair comparisons, and finds that U-Net-style architectures frequently perform best, with selective gains from incorporating FNO components. The study situates these models within the operator-learning paradigm and highlights how different architectures capture local versus global spatiotemporal information. The findings offer practical guidance for building robust, generalizable neural surrogates for PDEs, and the accompanying resources are open source.

Abstract

Partial differential equations (PDEs) are central to describing complex physical system simulations. Their expensive solution techniques have led to increased interest in deep-neural-network-based surrogates. However, the practical utility of training such surrogates is contingent on their ability to model complex multi-scale spatio-temporal phenomena. Various neural network architectures have been proposed to target such phenomena, most notably Fourier Neural Operators (FNOs), which give a natural handle over local and global spatial information via parameterization of different Fourier modes, and U-Nets, which treat local and global information via downsampling and upsampling paths. However, generalizing across different equation parameters or time-scales remains a challenge. In this work, we make a comprehensive comparison between various FNO, ResNet, and U-Net-like approaches to fluid mechanics problems in both vorticity-stream and velocity function form. For U-Nets, we transfer recent architectural improvements from computer vision, most notably from object segmentation and generative modeling. We further analyze the design considerations for using FNO layers to improve the performance of U-Net architectures without a major increase in computational cost. Finally, we show promising results on generalization to different PDE parameters and time-scales with a single surrogate model. Source code for our PyTorch benchmark framework is available at https://github.com/microsoft/pdearena.
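
To make the abstract's remark about Fourier modes concrete, here is a minimal sketch of the spectral multiplication step alone, for a single-channel 1D signal. It is not the paper's implementation or the pdearena API: a full FNO layer also mixes channels, adds a pointwise linear path, and applies a nonlinearity. The names `dft`, `idft`, and `spectral_multiply` are hypothetical, and a naive DFT from the standard library stands in for an FFT to keep the example dependency-free.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    return [
        sum(signal[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, normalized by 1/n."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * i / n) for k in range(n)) / n
        for i in range(n)
    ]


def spectral_multiply(signal, weights):
    """Scale the lowest len(weights) Fourier modes by `weights`; drop the rest."""
    n = len(signal)
    if 2 * len(weights) > n:
        raise ValueError("need len(weights) <= len(signal) // 2")
    spectrum = dft(signal)
    kept = [0j] * n
    for k, w in enumerate(weights):
        kept[k] = spectrum[k] * w
        if k > 0:
            # Mirror onto the negative frequency so the output stays real.
            kept[n - k] = (spectrum[k] * w).conjugate()
    return [value.real for value in idft(kept)]


# Usage: a slow wave (mode 1) plus a fast ripple (mode 6) on 16 grid points.
n = 16
slow = [math.sin(2 * math.pi * i / n) for i in range(n)]
ripple = [0.5 * math.sin(2 * math.pi * 6 * i / n) for i in range(n)]
signal = [s + r for s, r in zip(slow, ripple)]

# Keeping only modes 0..2 with unit weights removes the ripple.
filtered = spectral_multiply(signal, [1.0, 1.0, 1.0])
print(max(abs(f - s) for f, s in zip(filtered, slow)))  # close to zero, up to rounding
```

In this toy setting, the number of retained modes controls how much fine-scale (local) structure survives, while the lowest modes carry the domain-wide (global) shape of the signal.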

Paper Structure (29 sections, 10 equations, 22 figures, 10 tables)

Introduction
Preliminaries
PDE Surrogates
Operator learning
Experiments
Probing parameter conditioning.
Conclusion
Related work
Experiments
Experimental details
Loss functions and metrics.
Training and model selection.
Computational resources.
Runtime comparison.
Additional model details
...and 14 more sections

Figures (22)

Figure 1: Example rollout trajectories of the best-performing U-Net model, which is trained to generalize across different timesteps ($\Delta t$) and different force terms.
Figure 2: Information flow in Fourier-based (left) and U-Net-based architectures (right). FNO layers (Li et al., 2020) consist of fast Fourier transforms and weight multiplication in Fourier space. Low Fourier modes provide global information and high Fourier modes provide local information. U-Nets (Ronneberger et al., 2015) are constructed as a spatial downsampling pass, followed by a spatial upsampling pass, where information from the downsampling pass is added via skip-connections.
Figure 3: Analyzing filter properties of trained U-Net architectures. Absolute values of Fourier modes of the filters in each first layer of the respective down-sampling blocks are shown, where for each mode the average is taken over all filters.
Figure 4: One-step errors for modeling different PDEs, shown for different numbers of training trajectories. Results are averaged over three different random seeds, and are obtained for the velocity function and vorticity stream formulations of the shallow water equations for $2$-day prediction (left, middle), and for the Navier-Stokes equation (right). For better visibility only selected architectures are displayed; for full comparisons see the appendix on experiments. Note the logarithmic scale of the $y$-axes.
Figure 5: One-step errors obtained on the parameter conditioning experiments of the Navier-Stokes equation. Results are shown for selected architectures, different numbers of training trajectories, and different time windows: $\Delta t = 0.375\,\mathrm{s}$ (left), $\Delta t = 1.5\,\mathrm{s}$ (middle), and $\Delta t = 3\,\mathrm{s}$ (right). Results are averaged over $208$ different unseen evaluation buoyancy force values between $0.2$ and $0.5$.
...and 17 more figures
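
The U-Net information flow described in the Figure 2 caption (a downsampling pass, an upsampling pass, and skip-connections between them) can be illustrated with a toy 1D sketch. This is an assumption-laden illustration, not the paper's architecture: pair averaging and value repetition stand in for learned convolutions, the skip-connection is taken to be an element-wise sum, and the names `downsample`, `upsample`, and `toy_unet` are hypothetical.

```python
def downsample(x):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the resolution by repeating each value."""
    return [value for value in x for _ in range(2)]


def toy_unet(x, depth=2):
    """Run a down pass that stores skips, then an up pass that adds them back.

    Assumes len(x) is divisible by 2 ** depth.
    """
    skips = []
    h = list(x)
    for _ in range(depth):
        skips.append(h)      # remember the fine-scale (local) features
        h = downsample(h)    # move to a coarser, more global view
    for _ in range(depth):
        h = upsample(h)      # return to the finer resolution
        skip = skips.pop()   # matching resolution from the down pass
        h = [a + b for a, b in zip(h, skip)]
    return h


# Usage: the coarsest level sees the global mean, the skips restore local detail.
print(toy_unet([1.0, 3.0, 5.0, 7.0], depth=2))  # [7.0, 9.0, 15.0, 17.0]
```

In the example, the coarsest level holds the mean of the whole input (4.0), and each skip-connection adds back detail at its own resolution.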
