Training Variation of Physically-Informed Deep Learning Models

Ashley Lenau, Dennis Dimiduk, Stephen R. Niezgoda

TL;DR

The paper addresses training variability and reproducibility in physics-informed deep learning for enforcing boundary conditions. It compares baseline losses with three physics-informed regularizations using a Pix2Pix GAN to predict stress fields in high-contrast two-phase composites, quantified across 30 independent trainings with metrics $\text{MSE}_{\sigma}$ and $\text{MSE}_{equil}$. The results show that physics-informed losses reduce variation in $\text{MSE}_{equil}$ and, for some methods, $\text{MSE}_{\sigma}$ as well, and that approximately 15 trainings are sufficient to estimate variability via bootstrap. The work emphasizes reporting model variation for fair comparisons and provides guidelines applicable to ML in small-data materials science.

Abstract

A successful deep learning network depends heavily not only on the training dataset but also on the training algorithm used to condition the network for a given task. The loss function, dataset, and hyperparameter tuning all play an essential role in training a network, yet the reliability and reproducibility of training algorithms are rarely discussed. The rising popularity of physics-informed loss functions raises the question of how reliably a given loss function conditions a network to enforce a particular boundary condition. Reporting model variation is needed to assess a loss function's ability to consistently train a network to obey a given boundary condition, and it provides a fairer comparison among different methods. In this work, a Pix2Pix network predicting the stress fields of high elastic contrast composites is used as a case study. Several loss functions enforcing stress equilibrium are implemented, each displaying a different level of variation in convergence, accuracy, and enforcement of stress equilibrium across many training sessions. Suggested practices for reporting model variation are also shared.

Paper Structure

This paper contains 8 sections, 13 figures, 4 tables.

Figures (13)

  • Figure 1: The $\text{MSE}_{\sigma}$ (left) and $\text{MSE}_{equil}$ (right) throughout training for each of the 30 training sessions. The solid black line indicates the average across the 30 different sessions. (a) No physics-based regularization, (b) simple addition, (c) sigmoid, and (d) $\tan^{-1}$.
  • Figure 2: The average $\text{MSE}_{\sigma}$ and $\text{MSE}_{equil}$ for all 30 training sessions for each method.
  • Figure 3: Top row: Bootstrap analysis to measure the performance variation of average $\text{MSE}_{\sigma}$, $\text{MSE}_{equil}$, and convergence iteration as a function of number of training sessions. 10,000 samples were taken for each number of training sessions (sample size). Bottom row: the derivative of the curves plotted in the top row (corresponding column-wise). A line at 0 is plotted in the derivative plots to estimate when the derivative converges to 0. Each row corresponds to the legends on the right for their respective row.
  • Figure 4:
  • Figure 5: The absolute value of a divergence field ($K_2$ from Equation 3 in Ref. lenau2024importance is shown here) for the best, median, and worst performing training session for each method. The divergence fields are scaled to the target divergence fields' minimum and maximum values. Yellow pixels indicate a value greater than or equal to the target's largest deviation from equilibrium.
  • ...and 8 more figures
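
The bootstrap analysis described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `bootstrap_spread`, the synthetic `final_mse` values, and the choice of the standard deviation of the resampled mean as the variation statistic are all assumptions. Only the 30 training sessions and the 10,000 resamples per sample size come from the text above.

```python
import numpy as np

def bootstrap_spread(values, sample_sizes, n_resamples=10_000, seed=0):
    """For each sample size n, draw n sessions with replacement n_resamples
    times and return the standard deviation of the resampled means."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    spread = {}
    for n in sample_sizes:
        # Each row of idx selects one bootstrap sample of n training sessions.
        idx = rng.integers(0, len(values), size=(n_resamples, n))
        means = values[idx].mean(axis=1)
        spread[n] = means.std(ddof=1)
    return spread

# Hypothetical stand-in for the final metric of 30 independent trainings;
# real values would come from the trained models.
rng = np.random.default_rng(42)
final_mse = rng.lognormal(mean=-3.0, sigma=0.3, size=30)

sizes = list(range(2, 31))
spread = bootstrap_spread(final_mse, sizes)

# Variation as a function of the number of training sessions, and its
# derivative, mirroring the top and bottom rows of Figure 3.
curve = np.array([spread[n] for n in sizes])
slope = np.gradient(curve, np.array(sizes, dtype=float))
```

Under these assumptions, the number of sessions at which `slope` flattens toward 0 gives a rough estimate of how many trainings are needed before the variation estimate stops changing appreciably.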