Visualizing the loss landscapes of physics-informed neural networks

Conor Rowan, Finn Murphy-Blanchard

TL;DR

The purpose of this work is to introduce the loss landscape perspective to the scientific machine learning community, to compare the Deep Ritz and strong form losses, and to challenge prevailing intuitions about the complexity of the loss landscapes of physics-informed networks.

Abstract

Training a neural network requires navigating a high-dimensional, non-convex loss surface to find parameters that minimize the loss. In many ways, it is surprising that optimizers such as stochastic gradient descent and Adam can reliably locate minima that perform well on both the training and test data. To understand the success of training, a "loss landscape" community has emerged to study the geometry of the loss function and the dynamics of optimization, often using visualization techniques. However, these loss landscape studies have mostly been limited to machine learning for image classification. In the newer field of physics-informed machine learning, little work has been conducted to visualize the landscapes of losses defined not by regression to large data sets, but by differential operators acting on state fields discretized by neural networks. In this work, we provide a comprehensive review of the loss landscape literature, as well as a discussion of the few existing physics-informed works that investigate the loss landscape. We then use a number of the techniques we survey to empirically investigate the landscapes defined by the Deep Ritz and squared residual forms of the physics loss function. We find that the loss landscapes of physics-informed neural networks have many of the same properties as the data-driven classification problems studied in the literature. Unexpectedly, we find that the two formulations of the physics loss often give rise to similar landscapes, which appear smooth, well-conditioned, and convex in the vicinity of the solution. The purpose of this work is to introduce the loss landscape perspective to the scientific machine learning community, to compare the Deep Ritz and strong form losses, and to challenge prevailing intuitions about the complexity of the loss landscapes of physics-informed networks.

Paper Structure (29 sections, 30 equations, 37 figures)

Figures (37)

  • Figure 1: Exploring the monotonic linear interpolation (MLI) property of the Deep Ritz and PINNs loss landscapes. In all $60$ trials, the loss function decreases monotonically along the straight-line path connecting the initial and final parameters.
  • Figure 2: Compared to the norm of the initial parameters $\boldsymbol \theta_f$, the Hessian walk travels appreciable distances in parameter space without leaving the isocontour of the two loss functions.
  • Figure 3: For both DRM and PINN, the two solutions are separated by a high loss barrier along the straight-line path connecting them, but a quadratic Bézier path can be found over which the loss does not increase. This shows that although the optimizer does not find the same solution for different initializations, the two solutions lie in the same basin.
  • Figure 4: DRM loss landscape around randomly initialized parameters $\boldsymbol \theta_i$ in $9$ combinations of random directions.
  • Figure 5: PINN loss landscape around the same randomly initialized parameters as in DRM in $9$ combinations of the same random directions. A saddle in the fifth plot demonstrates the non-convexity of the loss landscape, though the loss surface is neither noisy nor ill-conditioned.
  • ...and 32 more figures
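
The path-based diagnostics named in the captions of Figures 1 and 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_loss` is a hypothetical convex quadratic standing in for a Deep Ritz or PINN loss, the function names are invented for this example, and the Bézier control point is passed in by hand, whereas in mode-connectivity studies it is usually found by optimization.

```python
import numpy as np


def linear_path(theta_a, theta_b, num_points=50):
    """Points on the straight line (1 - a) * theta_a + a * theta_b, a in [0, 1]."""
    alphas = np.linspace(0.0, 1.0, num_points)
    return [(1.0 - a) * theta_a + a * theta_b for a in alphas]


def quadratic_bezier_path(theta_a, theta_c, theta_b, num_points=50):
    """Points on the quadratic Bezier curve with endpoints theta_a, theta_b
    and control point theta_c."""
    ts = np.linspace(0.0, 1.0, num_points)
    return [
        (1.0 - t) ** 2 * theta_a + 2.0 * t * (1.0 - t) * theta_c + t ** 2 * theta_b
        for t in ts
    ]


def losses_along(path, loss_fn):
    """Evaluate the loss at every parameter vector on the path."""
    return np.array([loss_fn(theta) for theta in path])


def is_monotonically_decreasing(values, tol=1e-12):
    """MLI-style check: the loss never increases between consecutive points."""
    return bool(np.all(np.diff(values) <= tol))


def barrier_height(values):
    """Largest loss on the path minus the larger of the two endpoint losses."""
    return float(np.max(values) - max(values[0], values[-1]))


# Hypothetical stand-in loss (assumption): a convex quadratic with minimum at 0.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)  # symmetric positive definite


def toy_loss(theta):
    return 0.5 * theta @ A @ theta


theta_init = rng.standard_normal(5)  # plays the role of the initial parameters
theta_final = np.zeros(5)            # plays the role of the trained parameters

line = linear_path(theta_init, theta_final)
line_losses = losses_along(line, toy_loss)
mli_holds = is_monotonically_decreasing(line_losses)
line_barrier = barrier_height(line_losses)

# A hand-picked control point (assumption); a real study would train it.
theta_control = 0.5 * (theta_init + theta_final)
curve = quadratic_bezier_path(theta_init, theta_control, theta_final)
curve_losses = losses_along(curve, toy_loss)
```

For a real network, `theta` would be the flattened parameter vector and `loss_fn` would rebuild the model from it before evaluating the physics loss. The toy quadratic only exercises the bookkeeping; it says nothing about how actual Deep Ritz or PINN landscapes behave.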