Active Learning for Neural PDE Solvers

Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller, Makoto Takamoto, Mathias Niepert

TL;DR

Neural PDE solvers are powerful but data-intensive, motivating the need for efficient data acquisition. We introduce AL4PDE, a modular benchmark for pool-based active learning in neural PDE solvers, encompassing multiple PDEs, neural surrogates (U‑Net, SineNet, FNO), and batch acquisition strategies (uncertainty- and feature-based). Our experiments show that active learning can reduce average errors by up to $71\%$ and mitigate worst-case errors, while producing stable data distributions across seeds and enabling data reuse for different models. This framework provides a practical pathway to data-efficient, reliable neural PDE solvers and a platform for developing PDE-specific AL methods with real-world relevance.

Abstract

Solving partial differential equations (PDEs) is a fundamental problem in science and engineering. While neural PDE solvers can be more efficient than established numerical solvers, they often require large amounts of training data that is costly to obtain. Active learning (AL) could help surrogate models reach the same accuracy with smaller training sets by querying classical solvers with more informative initial conditions and PDE parameters. While AL is more common in other domains, it has yet to be studied extensively for neural PDE solvers. To bridge this gap, we introduce AL4PDE, a modular and extensible active learning benchmark. It provides multiple parametric PDEs and state-of-the-art surrogate models for the solver-in-the-loop setting, enabling the evaluation of existing and the development of new AL methods for neural PDE solving. We use the benchmark to evaluate batch active learning algorithms such as uncertainty- and feature-based methods. We show that AL reduces the average error by up to 71% compared to random sampling and significantly reduces worst-case errors. Moreover, AL generates similar datasets across repeated runs, with consistent distributions over the PDE parameters and initial conditions. The acquired datasets are reusable, providing benefits for surrogate models not involved in the data generation.
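The solver-in-the-loop setting described in the abstract can be illustrated with a minimal sketch of pool-based batch active learning. Everything here is an assumption made for illustration and not the AL4PDE API: `toy_solver` stands in for an expensive numerical solver, the surrogate is a small bootstrap ensemble of polynomial fits rather than a U-Net or FNO, and the batch is chosen by plain top-k ensemble disagreement, which is only the simplest uncertainty-based strategy.

```python
import numpy as np


def toy_solver(params):
    """Hypothetical stand-in for an expensive numerical solver."""
    return np.sin(3.0 * params) + 0.5 * params**2


def fit_ensemble(x, y, n_members, degree, rng):
    """Fit polynomial surrogates on bootstrap resamples of the labeled set."""
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(x), size=len(x))
        members.append(np.polyfit(x[idx], y[idx], deg=degree))
    return members


def ensemble_uncertainty(members, x):
    """Standard deviation of the member predictions at each candidate input."""
    preds = np.stack([np.polyval(coeffs, x) for coeffs in members])
    return preds.std(axis=0)


def active_learning(pool, n_init=10, batch_size=4, n_rounds=5, seed=0):
    """Pool-based batch AL: train, score the pool, label the top-k, repeat."""
    rng = np.random.default_rng(seed)
    unlabeled = np.ones(len(pool), dtype=bool)

    # Start from a small random labeled set.
    init = rng.choice(len(pool), size=n_init, replace=False)
    unlabeled[init] = False
    x_train = pool[init]
    y_train = toy_solver(x_train)

    for _ in range(n_rounds):
        members = fit_ensemble(x_train, y_train, n_members=5, degree=2, rng=rng)
        scores = ensemble_uncertainty(members, pool)
        scores[~unlabeled] = -np.inf  # never re-select labeled inputs

        # Query the solver only for the most uncertain candidates.
        picked = np.argsort(scores)[-batch_size:]
        unlabeled[picked] = False
        x_train = np.concatenate([x_train, pool[picked]])
        y_train = np.concatenate([y_train, toy_solver(pool[picked])])

    return x_train, y_train
```

Calling `active_learning(np.linspace(-1.0, 1.0, 200))` returns the 30 labeled inputs and their solver outputs. A random-sampling baseline would replace the `argsort` step with a uniform draw from the unlabeled candidates.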

Paper Structure

This paper contains 45 sections, 14 equations, 28 figures, 10 tables.

Figures (28)

  • Figure 1: An extensible benchmark framework for pool-based active learning for neural PDE solvers.
  • Figure 2: Structural overview of the AL4PDE benchmark.
  • Figure 3: Example trajectories of the PDEs.
  • Figure 4: Error over the number of trajectories in the training set (N). The shaded area represents the $95\%$ confidence interval of the mean calculated over multiple seeds. AL reduces the error relative to random sampling of the inputs on all tested PDEs except CNS, where the difference was not significant.
  • Figure 5: Error quantiles over the number of trajectories in the training set (N). The 50%, 95%, and 99% quantiles are displayed using full, dashed, and dotted lines, respectively. AL especially improves the higher error quantiles, making the trained model more reliable.
  • ...and 23 more figures
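
The summary statistics named in the captions of Figures 4 and 5 can be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, and the normal-approximation interval is only one common choice, since the captions do not say how the confidence interval was obtained.

```python
import numpy as np


def error_quantiles(errors, qs=(0.5, 0.95, 0.99)):
    """Quantiles of per-trajectory errors (50%, 95%, 99% by default)."""
    return {q: float(np.quantile(errors, q)) for q in qs}


def mean_ci95(values):
    """Mean over seeds and the half-width of a normal-approximation 95% CI.

    Assumption: 1.96 standard errors; a t-interval would be wider
    when only a few seeds are available.
    """
    values = np.asarray(values, dtype=float)
    sem = values.std(ddof=1) / np.sqrt(len(values))
    return float(values.mean()), float(1.96 * sem)
```

The upper quantiles track the rare, large errors that the mean can hide, which is why they are a natural measure of worst-case reliability.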