Pseudo-Hamiltonian neural networks for learning partial differential equations

Sølve Eidnes, Kjetil Olsen Lye

TL;DR

This work extends pseudo-Hamiltonian neural networks (PHNNs) to PDEs by casting the dynamics in the pseudo-Hamiltonian form $u_t = (S - R) \nabla H + f$, with neural-network representations of the energy, dissipation, and forcing terms coupled with convolution-based spatial operators. The method yields two modular PHNN variants (general and informed), which are compared against a baseline model and show better predictive accuracy and interpretability across several PDEs (KdV, KdV--Burgers, BBM, Perona--Malik, Cahn--Hilliard), while allowing disturbances to be removed after training. The paper also analyzes stability, discretization effects, kernel-size sensitivity, and convergence in idealized limits, and discusses future extensions to higher dimensions and alternative pseudo-Hamiltonian formulations. Overall, PHNNs provide a grey-box framework that preserves energy-like quantities while accommodating dissipation and external forcing, with practical implications for physics-informed modeling and image processing tasks.
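
The structure behind the form $u_t = (S - R) \nabla H + f$ can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a periodic grid, a central-difference matrix for $S$, a scaled negative second-difference matrix for $R$, and a hand-written toy energy standing in for a learned network. The names `M`, `dx`, `nu`, `energy`, `grad_energy`, `rhs` and `energy_rate` are hypothetical and chosen only for this example.

```python
import numpy as np

M = 64                      # number of grid points (illustrative choice)
dx = 2.0 * np.pi / M        # grid spacing on the periodic domain [0, 2*pi)
nu = 0.1                    # hypothetical damping coefficient
x = dx * np.arange(M)
I = np.eye(M)

# S: central first difference on a periodic grid, skew-symmetric (S.T == -S).
S = (np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)) / (2.0 * dx)

# R: minus the scaled second difference, symmetric positive semidefinite.
R = -nu * (np.roll(I, 1, axis=1) - 2.0 * I + np.roll(I, -1, axis=1)) / dx**2


def energy(u):
    """Toy discrete energy H(u); a stand-in for a learned network (assumption)."""
    return dx * np.sum(0.5 * u**2 + 0.25 * u**4)


def grad_energy(u):
    """Exact gradient of the toy energy with respect to the grid values."""
    return dx * (u + u**3)


def rhs(u, f):
    """Right-hand side (S - R) grad H(u) + f of the semi-discrete system."""
    return (S - R) @ grad_energy(u) + f


def energy_rate(u, f):
    """Instantaneous dH/dt along the flow, i.e. grad H(u) . u_t."""
    return grad_energy(u) @ rhs(u, f)


u0 = np.cos(x)
rate_unforced = energy_rate(u0, np.zeros(M))
print("dH/dt without forcing:", rate_unforced)  # non-positive by construction
```

Because `S` is skew-symmetric, it contributes nothing to the energy rate, so with zero forcing the toy energy can only change through the dissipative term `R`, and only downwards.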

Abstract

Pseudo-Hamiltonian neural networks (PHNN) were recently introduced for learning dynamical systems that can be modelled by ordinary differential equations. In this paper, we extend the method to partial differential equations. The resulting model is comprised of up to three neural networks, modelling terms representing conservation, dissipation and external forces, and discrete convolution operators that can either be learned or be given as input. We demonstrate numerically the superior performance of PHNN compared to a baseline model that models the full dynamics by a single neural network. Moreover, since the PHNN model consists of three parts with different physical interpretations, these can be studied separately to gain insight into the system, and the learned model is applicable also if external forces are removed or changed.
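
As a rough illustration of what a discrete convolution operator "given as input" could look like, the sketch below applies fixed finite-difference stencils as convolution kernels. It assumes periodic boundary conditions and a uniform grid, which the text above does not specify; the helper `periodic_conv` and the kernel names are hypothetical. In a learned variant, the kernel entries would be trainable parameters rather than fixed numbers.

```python
import numpy as np


def periodic_conv(u, kernel):
    """Apply an odd-length stencil to u with periodic (wrap-around) padding.

    Computes out[i] = sum_j kernel[j + k] * u[i + j] for j = -k, ..., k,
    where k = len(kernel) // 2.
    """
    k = len(kernel) // 2
    padded = np.pad(u, k, mode="wrap")
    return np.correlate(padded, kernel, mode="valid")


M = 128                                   # number of grid points (illustrative)
dx = 2.0 * np.pi / M
x = dx * np.arange(M)
u = np.sin(x)

# Fixed stencils: central first difference and standard second difference.
d1_kernel = np.array([-1.0, 0.0, 1.0]) / (2.0 * dx)
d2_kernel = np.array([1.0, -2.0, 1.0]) / dx**2

du = periodic_conv(u, d1_kernel)          # approximates cos(x)
d2u = periodic_conv(u, d2_kernel)         # approximates -sin(x)

print("max error, first derivative: ", np.max(np.abs(du - np.cos(x))))
print("max error, second derivative:", np.max(np.abs(d2u + np.sin(x))))
```

Both stencils are second-order accurate, so the printed errors shrink roughly by a factor of four when `M` is doubled.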

Paper Structure

This paper contains 27 sections, 1 theorem, 63 equations, 18 figures, 5 tables, 2 algorithms.

Key Result

Theorem 1

Let $\Delta t>0$, and $g, \tilde{g}:\mathbb{R}^M\to \mathbb{R}^M$. Assume that $u:[0, T)\to \mathbb{R}^M$ solves \eqref{eq:modelode} and that $\tilde{u}^1, \ldots, \tilde{u}^N\in \mathbb{R}^M$ obey [...] (this condition essentially says that they are obtained during training for the loss function). Then, [...]

Figures (18)

  • Figure 1: Predictions of the KdV equation \eqref{eq:kdv} by two PHNN models and our baseline model, compared to DGNet \cite{Matsubara2020deep} and PDE-FIND \cite{Rudy2017data}. The training data consist of 420 states, from 20 different initial conditions each sampled at $21$ equidistant points in time from $t = 0$ to $t = 0.2$, and the neural network models are all trained for 5000 epochs.
  • Figure 2: Predictions of the forced KdV--Burgers system \eqref{eq:forcedkdvburgers} with force \eqref{eq:kdvf}, obtained from the best of 10 models of each model type, after being trained for 5000 epochs, as evaluated by the mean MSE at $t=2$ on predictions from 10 random initial states.
  • Figure 3: Solutions of the various models, after being trained for 5000 epochs, of the forced KdV--Burgers system \eqref{eq:forcedkdvburgers} with $f$ given by \eqref{eq:kdvf} at time $t=2$. The line and the shaded area show the mean and the standard deviation, respectively, of 10 models of each type. The dashed black line is the ground truth. Upper row: The original system \eqref{eq:forcedkdvburgers} that the models are trained on. Second row: The learned force approximating $f$ in \eqref{eq:forcedkdvburgers}. Third row: Predictions with the force $f$ removed from the models. Lower row: Predictions with the external force and the dissipation term removed from the models.
  • Figure 4: Predictions of the forced KdV--Burgers system \eqref{eq:forcedkdvburgers} with force \eqref{eq:kdvf}, obtained from the best of 10 models of each model type, as evaluated by the mean MSE at $t=2$ on predictions from 10 random initial states.
  • Figure 5: Solutions of the various learned models of the forced KdV--Burgers system \eqref{eq:forcedkdvburgers} with $f$ given by \eqref{eq:kdvf} at time $t=2$. The line and the shaded area, barely visible in these plots, show the mean and the standard deviation, respectively, of 10 models of each type. The dashed black line is the ground truth. Upper row: The original system \eqref{eq:forcedkdvburgers} that the models are trained on. Second row: The learned force approximating $f$ in \eqref{eq:forcedkdvburgers}. Third row: Predictions with the force $f$ removed from the models. Lower row: Predictions with the external force and the dissipation term removed from the models.
  • ...and 13 more figures

Theorems & Definitions (6)

  • Example 1
  • Example 2
  • Theorem 1
  • proof
  • Remark 1
  • Remark 2