PDEBENCH: An Extensive Benchmark for Scientific Machine Learning

Makoto Takamoto; Timothy Praditia; Raphael Leiteritz; Dan MacKinlay; Francesco Alesiani; Dirk Pflüger; Mathias Niepert

TL;DR

PDEBench introduces a comprehensive benchmark for Scientific Machine Learning, providing 11 PDEs across 1D–3D, 35 datasets, and an extensible API for data generation and evaluation. It formalizes both forward surrogate modeling (a learned emulator $\mathfrak{F}_\theta$) and inverse problems, and benchmarks several baselines, including U-Net, the Fourier Neural Operator (FNO), PINNs, and a Gradient-Based Inverse Method, using physics-aware metrics. The experiments show that there is no one-size-fits-all method: FNO typically achieves lower RMSE and learns the solution operator more effectively, autoregressive U-Nets require stabilization to avoid instability during rollout, and certain high-frequency regimes remain challenging for all baselines. The work provides ready-to-run data, code, and evaluation protocols to enable reproducible benchmarking and community-driven expansion.

Abstract

Machine learning-based modeling of physical systems has experienced increased interest in recent years. Despite some impressive progress, there is still a lack of benchmarks for Scientific ML that are easy to use but still challenging and representative of a wide range of problems. We introduce PDEBench, a benchmark suite of time-dependent simulation tasks based on Partial Differential Equations (PDEs). PDEBench comprises both code and data to benchmark the performance of novel machine learning models against both classical numerical simulations and machine learning baselines. Our proposed set of benchmark problems contributes the following unique features: (1) a much wider range of PDEs compared to existing benchmarks, ranging from relatively common examples to more realistic and difficult problems; (2) much larger ready-to-use datasets compared to prior work, comprising multiple simulation runs across a larger number of initial and boundary conditions and PDE parameters; (3) more extensible source code with user-friendly APIs for data generation, and baseline results with popular machine learning models (FNO, U-Net, PINN, Gradient-Based Inverse Method). PDEBench allows researchers to extend the benchmark freely for their own purposes using a standardized API and to compare the performance of new models to existing baseline methods. We also propose new evaluation metrics that aim to provide a more holistic understanding of learning methods in the context of Scientific ML. With those metrics we identify tasks that are challenging for recent ML methods and propose these tasks as future challenges for the community. The code is available at https://github.com/pdebench/PDEBench.
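
To make the notions of an autoregressive forward surrogate and a per-time-step RMSE concrete, the following is a minimal sketch in NumPy. It is not PDEBench's API: the names `rollout`, `rmse_per_step`, `exact_step`, and `surrogate_step` are hypothetical, and the "PDE" is a toy linear decay chosen only so that the example runs without any trained model or dataset.

```python
import numpy as np


def rollout(step_fn, u0, n_steps):
    """Apply a one-step map repeatedly, feeding each output back in as input."""
    states = [u0]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return np.stack(states)  # shape: (n_steps + 1, *u0.shape)


def rmse_per_step(pred, target):
    """RMSE over all non-time axes, giving one value per time step."""
    axes = tuple(range(1, pred.ndim))
    return np.sqrt(np.mean((pred - target) ** 2, axis=axes))


# Hypothetical stand-ins: the reference dynamics decay by 0.9 per step,
# while the imperfect surrogate decays by 0.88 per step.
def exact_step(u):
    return 0.9 * u


def surrogate_step(u):
    return 0.88 * u


# One full period of a sine wave on 64 grid points as the initial condition.
u0 = np.sin(np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False))

truth = rollout(exact_step, u0, 10)
pred = rollout(surrogate_step, u0, 10)
errors = rmse_per_step(pred, truth)  # one RMSE value per unrolled step
```

Because every prediction is fed back in as the next input, a small one-step error compounds over the rollout, which is one reason autoregressive surrogates may need stabilization. In practice `step_fn` would wrap a trained model and `truth` would come from a numerical solver.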

Paper Structure (57 sections, 22 equations, 34 figures, 20 tables)

Motivation
Related Work
PDEBench: A Benchmark for Scientific Machine Learning
General Problem Definition
Overview of Datasets and PDEs
Compressible Navier-Stokes equations
Incompressible Navier-Stokes equations
Shallow-Water Equations
Overview of Metrics
Existing Baseline Surrogate Models
U-Net
Fourier neural operator (FNO)
Physics-Informed Neural Networks (PINNs)
Gradient-Based Inverse Method
Data Format, Benchmark Access, Maintenance, and Extensibility
...and 42 more sections

Figures (34)

Figure 1: PDEBench provides multiple non-trivial challenges from the Sciences to benchmark current and future ML methods, including wave propagation and turbulent flow in 2D and 3D.
Figure 2: Comparisons of baseline models' performance for different problems for (a) the forward problem and (b) the inverse problem.
Figure 3: Detailed visualization of (a) Burgers', (b) DarcyFlow, and (c) Compressible NS eqs.
Figure 4: (a) Visualization of the 2D diffusion-reaction data generated with a standard finite volume (FVM) solver and a resolution of $128^2$, (b) FNO prediction, and (c) U-Net prediction.
Figure 5: (a) Plots of the RMSE calculated at different unrolled time steps, (b) comparison of each autoregressive method, and (c) RMSE for temporal extrapolation.
...and 29 more figures
