Automated machine learning for physics-informed convolutional neural networks

Wanyun Zhou; Haoze Song; Xiaowen Chu

TL;DR

This work addresses the difficulty of manually designing physics-informed CNNs (PICNNs) for parametric PDEs by introducing Auto-PICNN, an AutoML framework that automatically searches both loss functions and network architectures. It pairs an operator-infused loss-function search space with an entire-structured CNN search space and uses a two-stage search: Bayesian optimization for the loss function, followed by policy-based reinforcement learning for the architecture. A ConvLSTM module is included to handle spatiotemporal dynamics. Empirically, Auto-PICNN outperforms manually designed PICNN baselines and neural operators across six PDE systems, with up to a $59.78$-fold reduction in error on some tasks and a $13.31$-fold improvement on average. The search takes roughly two days in the worst case, and the framework is robust across diverse PDEs, offering a practical way to deploy physics-informed models without requiring expertise in neural architecture design.

Abstract

Recent advances in deep learning for solving partial differential equations (PDEs) have introduced physics-informed neural networks (PINNs), which integrate machine learning with physical laws. Physics-informed convolutional neural networks (PICNNs) extend PINNs by leveraging CNNs for enhanced generalization and efficiency. However, current PICNNs depend on manual design, and inappropriate designs may not effectively solve PDEs. Furthermore, due to the diversity of physical problems, the ideal network architectures and loss functions vary across different PDEs. It is impractical to find the optimal PICNN architecture and loss function for each specific physical problem through extensive manual experimentation. To surmount these challenges, this paper uses automated machine learning (AutoML) to automatically and efficiently search for the loss functions and network architectures of PICNNs. We introduce novel search spaces for loss functions and network architectures and propose a two-stage search strategy. The first stage searches for the factors and residual adjustment operations that influence the loss function, while the second stage searches for the best CNN architecture. Experimental results show that our automatic search method significantly outperforms the manually designed model on multiple datasets.
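To make the idea of a physics-driven loss concrete, the following is a minimal sketch, not the paper's searched loss. It assumes a steady-state heat (Laplace) equation on a uniform grid and a standard five-point finite-difference stencil; the function names `laplacian_5pt` and `pde_residual_loss` are hypothetical and chosen only for illustration. The loss is the mean squared PDE residual over interior grid points, so no labeled solution is needed.

```python
import numpy as np


def laplacian_5pt(u, h):
    """Five-point finite-difference Laplacian on the interior grid points.

    Assumption: uniform grid spacing h in both directions.
    """
    return (
        u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
        - 4.0 * u[1:-1, 1:-1]
    ) / h**2


def pde_residual_loss(u, h):
    """Mean squared residual of the Laplace equation (illustrative only)."""
    residual = laplacian_5pt(u, h)
    return float(np.mean(residual**2))


# Uniform grid on the unit square.
n = 33
h = 1.0 / (n - 1)
x, y = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")

# x^2 - y^2 is harmonic, so its residual loss is zero up to round-off.
loss_harmonic = pde_residual_loss(x**2 - y**2, h)

# x^2 + y^2 has Laplacian 4 everywhere, so its residual loss is about 16.
loss_non_harmonic = pde_residual_loss(x**2 + y**2, h)
```

In a PICNN the field `u` would be the network output, and the stencil would typically be applied as a fixed convolution kernel so that the loss stays differentiable with respect to the network weights. The factors and residual adjustment operations mentioned in the abstract would modify this basic residual and are not modeled here.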

Paper Structure (26 sections, 21 equations, 7 figures, 4 tables)

Introduction
Results
Overall framework of Auto-PICNN
Search objective
Benchmarks and Baselines
Comparison with the baseline models
Loss function comparison
Architecture search space comparison
Search strategies comparison
Discussion
Methods
Physics-Informed Convolutional Neural Networks (PICNNs)
Loss function construction
Boundary conditions
Calculation of spatial derivatives
...and 11 more sections

Figures (7)

Figure 1: Schematic illustration of the automated framework for physics-informed convolutional neural network design. The framework leverages automated machine learning (AutoML) with a two-stage search strategy to identify optimal network architectures and loss functions for training. Once trained, the model can directly infer solutions for parametric PDEs with any unseen parameter configurations without requiring retraining.
Figure 2: Comparison of physics-driven loss and prediction error across different network architectures. Results are obtained from 100 randomly selected network architectures trained on the parametric heat equation. (a) presents the PDE-driven loss on the training set and the relative $L^{2}$ error on the test set, while (b) compares the relative $L^{2}$ errors on the training and test sets.
Figure 3: Prediction error curves on the validation set during training for baseline models and our model on different partial differential equation datasets. (a), (b), and (c) show the relative $L^{2}$ error of our model, Auto-PICNN, compared with the baseline models PhyGeoNet, PDE-surrogate, and PhyCRNet on the heat equations, Darcy flow, and 2D Burgers' equations, respectively. (d), (e), and (f) compare the relative $L^{2}$ error obtained with three different loss functions (our searched loss function, the baseline models' loss function, and the vanilla MSE loss) on the heat equations, Darcy flow, and 2D Burgers' equations, respectively.
Figure 4: Visualizations of the predicted solutions and point-wise prediction errors of our model and baseline models on different partial differential equation datasets. (a), (b), and (c) visualize the predicted solutions and corresponding point-wise errors of our model, Auto-PICNN, and the baseline models PhyGeoNet, PDE-surrogate, and PhyCRNet on the heat equations, Darcy flow, and 2D Burgers' equations, respectively. (d), (e), and (f) visualize the predicted solutions and point-wise errors obtained with different loss functions (our searched loss function, the baseline models' loss function, and the vanilla MSE loss) on the heat equations, Darcy flow, and 2D Burgers' equations, respectively. In the ground truth and prediction visualizations, colorbars indicate the magnitude of the physical field: temperature for the heat equations, pressure for Darcy flow, and fluid velocity for Burgers' equations. In the error visualizations, colorbars represent the point-wise absolute error between predictions and ground truth.
Figure 5: Reinforcement learning-based network architecture search process for the heat equation. Subfigures (a), (b), (c), (d), and (e) illustrate the search process as the training dataset grows, with training set sizes of 2, 3, 4, 5, and 6 samples, respectively. Each subfigure shows how the reward (the reciprocal of the relative $L^{2}$ error) evolves during the reinforcement learning process as the controller learns to generate better network architectures.
...and 2 more figures
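
The captions above report the relative $L^{2}$ error and, for Figure 5, a reward defined as its reciprocal. A minimal NumPy sketch of both quantities follows. The function names and the small `eps` guard against division by zero are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def relative_l2_error(pred, truth):
    """Relative L2 error: ||pred - truth||_2 / ||truth||_2 over all grid points."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return float(
        np.linalg.norm(diff.ravel())
        / np.linalg.norm(np.asarray(truth, dtype=float).ravel())
    )


def reward(pred, truth, eps=1e-12):
    """Reciprocal of the relative L2 error.

    eps is a hypothetical safeguard for a perfect prediction (zero error).
    """
    return 1.0 / (relative_l2_error(pred, truth) + eps)


# Example: a prediction that overshoots a constant field by 10 percent.
truth = np.ones((4, 4))
pred = 1.1 * truth
err = relative_l2_error(pred, truth)
rew = reward(pred, truth)
```

Because the reward is the reciprocal of the error, a lower relative $L^{2}$ error gives a larger reward, which is the direction a policy-based controller maximizes.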
