Masked Autoencoders are PDE Learners

Anthony Zhou; Amir Barati Farimani

TL;DR

This work applies masked autoencoder pretraining to a diverse set of 1D and 2D PDEs to learn latent physics representations without labeled data. The MAE encoder captures meaningful structure across coefficients, boundary conditions, and discretizations, enabling downstream PDE feature prediction and conditioning of neural solvers for time-stepping and super-resolution. Results show latent structure aligns with PDE properties and improves downstream performance within the pretraining distribution, though extrapolation to unseen equations remains challenging. The approach offers a scalable path toward unified latent physics representations from heterogeneous, unlabeled PDE data and suggests directions for scaling and latent arithmetic in physics-informed learning.

Abstract

Neural solvers for partial differential equations (PDEs) have great potential to generate fast and accurate physics solutions, yet their practicality is currently limited by their generalizability. PDEs evolve over broad scales and exhibit diverse behaviors; predicting these phenomena will require learning representations across a wide variety of inputs which may encompass different coefficients, boundary conditions, resolutions, or even equations. As a step towards generalizable PDE modeling, we adapt masked pretraining for physics problems. Through self-supervised learning across PDEs, masked autoencoders can consolidate heterogeneous physics to learn rich latent representations. We show that learned representations can generalize to a limited set of unseen equations or parameters and are meaningful enough to regress PDE coefficients or to classify PDE features. Furthermore, conditioning neural solvers on learned latent representations can improve time-stepping and super-resolution performance across a variety of coefficients, discretizations, or boundary conditions, as well as on certain unseen PDEs. We hope that masked pretraining can emerge as a unifying method across large, unlabeled, and heterogeneous datasets to learn latent physics at scale.

Paper Structure (36 sections, 4 equations, 8 figures, 17 tables)

Introduction
Related Work
Neural PDE Solvers
Pretraining for PDEs
Masked Pretraining
Situating our Contribution
Methods
Masked Pretraining for PDEs
Lie Point Symmetry Data Augmentation
Multi-Resolution Pretraining
Experimental Setup
PDEs and Datasets
Data Augmentations
Results
MAE Pretraining
...and 21 more sections

Figures (8)

Figure 1: We investigate learning diverse PDE dynamics with masked autoencoders (MAE) and using learned representations to benefit various downstream tasks. (Masked Pretraining) An encoder is trained on unmasked patches of spatiotemporal PDE data, while a decoder reconstructs true data from latent encodings and learned mask tokens. (Supervised Fine-tuning) Pretrained encoders can be used to quickly regress equation coefficients or predict key PDE features. (Conditional Time-stepping) Neural solvers can achieve higher-accuracy predictions through conditioning on MAE encodings. (Conditional Super-resolution) SR models can also benefit from conditioning on MAE encodings, using a discretization inversion ($D^{-1}$) and a neural operator [yang_superres] to predict high-resolution physics.
Figure 2: Example results after training on the 1D KdV-Burgers equation with a masking ratio of 75%. For each triplet, we show the masked PDE (left), the MAE reconstruction (middle), and the ground-truth (right), and plot space and time on the $x$ and $y$ axes respectively. The MAE can reconstruct multiple resolutions of KdV-Burgers data and interpolate to the 1D Heat and inviscid Burgers equations. For the 1D Advection and KS equations, which contain novel PDE terms ($u_x, u_{xxxx}$), the extrapolation performance is limited.
Figure 4: t-SNE embeddings of various PDEs. Plots show embeddings before and after using the MAE to encode samples, shown on the top and bottom. The MAE latent space shows structure despite not seeing labels of coefficients, PDEs, or BCs. A: 1D KdV-Burgers equation, colored by $\alpha$. B: 1D Advection equation, colored by $c$. C: 1D Heat, Burgers, Advection, and KS equations, colored by PDE. D: 1D KdV-Burgers equation, colored by resolution. E: 1D Heat equation, colored by boundary condition. F: 2D Heat, Advection and Burgers equations, colored by $\nu$ and $c$.
Figure 5: Additional 1D MAE reconstruction examples after pretraining on the 1D KdV-Burgers equation. Each triplet is shown with the masked sample (left), MAE reconstruction (middle), and ground-truth PDE (right). We include additional reconstructions of unseen boundary conditions for the Heat and Wave equations.
Figure 6: Additional 2D MAE reconstruction examples after pretraining on the 2D Heat, Advection, and Burgers equations. Each sample is shown with the masked sample (top), MAE reconstruction (middle), and ground-truth PDE (bottom). We include sample MAE predictions at variable resolutions for the 2D Heat, Advection, and Burgers equations; the lowest resolution (top) is $(48, 48)$, the medium resolution (middle) is $(52, 52)$, and the high resolution (bottom) is $(56, 56)$. We include additional reconstructions of the incompressible NS equations at the native resolution $(64, 64)$.
...and 3 more figures
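
The masking step described in the Figure 1 and Figure 2 captions (an encoder that sees only unmasked patches of spatiotemporal data, with a 75% masking ratio in the Figure 2 example) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `random_patch_mask`, the 4x4 patch size, the 32x64 grid, and the toy traveling-wave field are all assumptions made for the example, and the encoder and decoder themselves are omitted.

```python
import numpy as np


def random_patch_mask(u, patch=(4, 4), mask_ratio=0.75, seed=0):
    """Split a (nt, nx) field into non-overlapping patches and hide a random subset.

    Hypothetical helper for illustration only. Returns
    (patches, visible_idx, masked_idx), where patches has shape
    (num_patches, patch[0] * patch[1]).
    """
    nt, nx = u.shape
    pt, px = patch
    if nt % pt or nx % px:
        raise ValueError("field shape must be divisible by the patch size")

    # (nt/pt, pt, nx/px, px) -> (nt/pt, nx/px, pt, px) -> (num_patches, pt*px)
    patches = (
        u.reshape(nt // pt, pt, nx // px, px)
        .transpose(0, 2, 1, 3)
        .reshape(-1, pt * px)
    )

    num_patches = patches.shape[0]
    num_masked = int(round(mask_ratio * num_patches))

    # A random permutation decides which patches are hidden from the encoder.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_patches)
    masked_idx = np.sort(perm[:num_masked])
    visible_idx = np.sort(perm[num_masked:])
    return patches, visible_idx, masked_idx


# Toy spatiotemporal field (assumed for the example): a wave moving in x over time.
t = np.linspace(0.0, 1.0, 32)[:, None]          # time axis, shape (32, 1)
x = np.linspace(0.0, 2.0 * np.pi, 64)[None, :]  # space axis, shape (1, 64)
u = np.sin(x - t)                               # broadcasts to shape (32, 64)

patches, visible_idx, masked_idx = random_patch_mask(u)

# Only the visible patches would be passed to the encoder; a decoder would
# then be asked to reconstruct the patches at masked_idx.
encoder_input = patches[visible_idx]
```

With these assumed sizes the field splits into 128 patches, of which 96 are masked and 32 remain visible. Sorting the index arrays keeps the visible patches in their original order, which makes it straightforward to attach positional information before encoding.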
