Observations on Recurrent Loss in the Neural Network Model of a Partial Differential Equation: the Advection-Diffusion Equation

Jonah A. Reeger

TL;DR

The paper addresses the stability gap in ML-based PDE solvers by constructing a recurrent network that exactly emulates a multistep Adams-Bashforth time integrator together with a collocation-based spatial operator for the advection-diffusion equation. It treats a single step as a linear neural network and trains a recurrent loss across multiple time steps to adapt the discrete operator so that the resulting method remains stable, leveraging classical spectral stability analysis. Key findings show that stable solutions can be found even when standard discretizations fail, but the results are highly sensitive to multiple hyperparameters and often unpredictable. This work demonstrates a data-driven route toward stabilizing linear PDE solvers and motivates further research into robust optimization strategies, initializations, and architectures to generalize stability guarantees.

Abstract

A growing body of literature leverages machine learning (ML) techniques to build novel approaches to approximating the solutions of partial differential equations (PDEs). Noticeably absent from the literature is a systematic exploration of the stability of the solutions generated by these ML approaches. Here, a recurrent network is introduced that matches precisely the evaluation of a multistep method paired with a collocation method for approximating spatial derivatives in the advection-diffusion equation. This allows for two things: 1) the use of traditional tools for analyzing the stability of a numerical method for solving PDEs and 2) the use of efficient ML techniques for training approximations of the action of (spatial) linear operators. Observations on the impact of varying the large number of parameters in even this simple linear problem are presented. Further, it is demonstrated that stable solutions can be found even where traditional numerical methods may fail.
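To make the "multistep method as a linear recurrent network" idea concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes the equation has the form $u_t + c\,u_x = \nu\,u_{xx}$ on a periodic domain, uses second-order centered differences for the spatial operator, and takes $s=2$ (AB-2). The function names, the grid, the parameter values, and the forward Euler start-up step are all illustrative assumptions. The sketch shows that one AB-2 step is a linear map of the two most recent states, so repeating it $Q$ times amounts to repeated multiplication by a fixed block matrix.

```python
import numpy as np

def centered_difference_operator(N, nu, c=1.0, length=2.0 * np.pi):
    """Matrix for -c*u_x + nu*u_xx on N periodic nodes (assumed problem form)."""
    h = length / N
    I = np.eye(N)
    fwd = np.roll(I, 1, axis=1)    # (fwd @ u)[i] == u[(i + 1) % N]
    bwd = np.roll(I, -1, axis=1)   # (bwd @ u)[i] == u[(i - 1) % N]
    D1 = (fwd - bwd) / (2.0 * h)           # second-order first derivative
    D2 = (fwd - 2.0 * I + bwd) / h**2      # second-order second derivative
    return -c * D1 + nu * D2

def ab2_step(D, h_t, u_curr, u_prev):
    """One AB-2 step for u' = D u; linear in (u_curr, u_prev)."""
    return u_curr + h_t * (1.5 * (D @ u_curr) - 0.5 * (D @ u_prev))

def ab2_companion(D, h_t):
    """Block matrix M such that [u_next; u_curr] = M @ [u_curr; u_prev]."""
    N = D.shape[0]
    I = np.eye(N)
    top = np.hstack([I + 1.5 * h_t * D, -0.5 * h_t * D])
    bottom = np.hstack([I, np.zeros((N, N))])
    return np.vstack([top, bottom])

# Illustrative parameters (assumptions, not values taken from the paper).
N, nu, h_t, Q = 16, 0.1, 1e-2, 50
x = 2.0 * np.pi * np.arange(N) / N
D = centered_difference_operator(N, nu)

u_prev = np.sin(x)
u_curr = u_prev + h_t * (D @ u_prev)   # forward Euler start-up (assumption)

M = ab2_companion(D, h_t)
state = np.concatenate([u_curr, u_prev])
for _ in range(Q):
    # Recurrent form: feed the previous outputs back in as inputs.
    u_curr, u_prev = ab2_step(D, h_t, u_curr, u_prev), u_curr
    # Equivalent single linear layer applied repeatedly.
    state = M @ state

u_step, u_matrix = u_curr, state[:N]
```

In this picture the entries of `D` play the role of the trainable weights, and a recurrent loss would compare the state after several applications of the step against reference data.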

Paper Structure

This paper contains 11 sections, 58 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: A simplified schematic of a single step of an $s$-step multistep method for a semi-discrete PDE cast as a network.
  • Figure 2: Eigenvalues of $D'$ for various values of $\nu$ with $h_{t}=h_{t,N,\nu,s}$. The value of $h_{t,N,\nu,s}$ is indicated in parentheses for each choice of $N$, $\nu$ and $s$. The stability region (shaded) of AB-$s$ is shown with its stability boundary outlined by a solid curve. Markers indicate the locations of the scaled eigenvalues. Notice that these choices of $h_{t}$ are (approximately) the largest such that the scaled eigenvalues fall inside the stability region.
  • Figure 3: Left: modulus of $b_{\eta}(0)$ for the bump function \ref{eq:bumpfunction}. Right: log base 10 of the error in the approximation \ref{eq:partialsum} of \ref{eq:bumpfunction}.
  • Figure 4: Left column: log base 10 of the infinity-norm absolute forward error in the numerical solution computed using the weights generated by minimizing $J$. The colors indicate varying values of $Q$ and the curve styles depict different values of $T$. Right column: eigenvalues of $D$, constructed from these weights, with markers this time indicating different values of $T$. Rows correspond to different values of $\kappa_{\max}$, increasing from top to bottom. In every row, the black curves in the left column and the black dots in the right column correspond to the absolute forward error and eigenvalues, respectively, of the second-order centered difference method (with corresponding matrix $D'$). The remaining parameter choices are indicated in the title of the plot. Notice that increases in $Q$, the number of time steps included in the recurrent loss function, do not predictably improve stability. Similarly, performance relative to increases in $T$, the number of training samples, is erratic.
  • Figure 5: See figure \ref{fig:ErrorandEigenvalues_s_2_n_9_neglog10nu_Inf_N_101_htmult_1_10_p_2} for a discussion of what is depicted in each subplot. Parameter choices are indicated in the title of the plot. Notice that a systematic method for choosing $\kappa_{\max}$ to produce weights for a stable numerical method may not exist.
  • ...and 3 more figures
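
The test described in the Figure 2 caption, whether the eigenvalues of the spatial operator scaled by $h_t$ fall inside the AB-$s$ stability region, can be sketched for $s=2$ as follows. This is a hedged illustration rather than the paper's code: the function names, the tolerance, and the small diagonal example matrix are assumptions. For the scalar test equation $y' = \lambda y$ with $z = h_t\lambda$, AB-2 has the characteristic polynomial $\zeta^2 - (1 + \tfrac{3}{2}z)\,\zeta + \tfrac{1}{2}z$, and $z$ lies in the stability region when both roots satisfy $|\zeta| \le 1$. The sketch does not check that roots on the unit circle are simple.

```python
import numpy as np

def ab2_max_root_modulus(z):
    """Largest |zeta| among the roots of zeta^2 - (1 + 1.5 z) zeta + 0.5 z."""
    roots = np.roots([1.0, -(1.0 + 1.5 * z), 0.5 * z])
    return float(np.max(np.abs(roots)))

def scaled_eigenvalues_stable(D, h_t, tol=1e-10):
    """True if every h_t * eig(D) has AB-2 root moduli <= 1 (up to tol)."""
    scaled = h_t * np.linalg.eigvals(D)
    return all(ab2_max_root_modulus(z) <= 1.0 + tol for z in scaled)

# Hypothetical operator with real, non-positive eigenvalues (diffusion-like).
D_example = np.diag([-4.0, -1.0, 0.0])

stable_small_step = scaled_eigenvalues_stable(D_example, 0.2)  # z in {-0.8, -0.2, 0}
stable_large_step = scaled_eigenvalues_stable(D_example, 0.5)  # z = -2 is included

# A purely imaginary scaled eigenvalue, as centered differences give for pure advection.
growth_imaginary = ab2_max_root_modulus(0.5j)
```

Here `stable_small_step` is true and `stable_large_step` is false, since the AB-2 stability region meets the negative real axis only on $[-1, 0]$. The value `growth_imaginary` exceeds 1 because the AB-2 stability region touches the imaginary axis only at the origin.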