Training Hamiltonian neural networks without backpropagation

Atamert Rahma, Chinmay Datar, Felix Dietrich

TL;DR

It is empirically shown that data-driven sampling of the network parameters outperforms both data-agnostic sampling and traditional gradient-based iterative optimization when approximating functions with steep gradients or wide input domains.

Abstract

Neural networks that synergistically integrate data and physical laws offer great promise in modeling dynamical systems. However, iterative gradient-based optimization of network parameters is often computationally expensive and suffers from slow convergence. In this work, we present a backpropagation-free algorithm to accelerate the training of neural networks for approximating Hamiltonian systems through data-agnostic and data-driven algorithms. We empirically show that data-driven sampling of the network parameters outperforms data-agnostic sampling or the traditional gradient-based iterative optimization of the network parameters when approximating functions with steep gradients or wide input domains. We demonstrate that our approach is more than 100 times faster on CPUs than traditionally trained Hamiltonian Neural Networks using gradient-based iterative optimization, and is more than four orders of magnitude more accurate in chaotic examples, including the Hénon-Heiles system.

Paper Structure

This paper contains 20 sections, 18 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Approximate-SWIM (A-SWIM) algorithm: This figure illustrates the process of approximating a Hamiltonian system from data, including generalized "position" $q$ and "momentum" $p$ coordinates, along with their time derivatives $\dot{q}$ and $\dot{p}$. Left: The given Hamiltonian system. Center: Sampling hidden layer weights and biases $\{W_{l}, b_{l}\}_{l=1}^{L}$ in the unsupervised setting. Right: Resampling hidden layer parameters using the approximated function values $\widehat{\mathcal{H}}(q,p)$ obtained in stage two. Note that in steps two and three, the linear layer parameters $\{ W_{L+1}, b_{L+1} \}$ are optimized by solving a linear least-squares problem.
  • Figure 2: Single pendulum (with frequency parameter) approximation errors are plotted.
  • Figure B.3: Single pendulum approximation errors and training times for the larger domain are plotted. See \ref{table:domain-params} and \ref{table:approx-params} for domain and model parameters.
  • Figure B.4: Lotka-Volterra approximation errors are plotted. The target Hamiltonian in the left and center plots has an equilibrium near the zero-vector, whereas the target Hamiltonian in the right plot has an equilibrium around five in both dimensions. Domain and model parameters are set according to \ref{table:domain-params} and \ref{table:approx-params}.
  • Figure B.5: Double pendulum approximation errors and training times are displayed. Network width was scaled and other model parameters were set as described in \ref{table:chaotic-system-approx-params}. Domain information is listed in \ref{table:domain-params}.
  • ...and 2 more figures
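
The Figure 1 caption describes hidden-layer parameters being sampled rather than trained, with the final linear layer obtained from a linear least-squares problem. The following is a minimal sketch of that idea for the data-agnostic case only, assuming a single hidden layer with tanh activation and Gaussian/uniform sampling of the hidden weights and biases. It is not the authors' SWIM or A-SWIM implementation; the function names (`sample_hidden`, `fit_output_layer`, `predict`, `pendulum_demo`), the sampling distributions, and the pendulum setup are all illustrative assumptions. Because the gradient of the network output is linear in the last-layer weights, Hamilton's equations $\dot{q} = \partial\mathcal{H}/\partial p$, $\dot{p} = -\partial\mathcal{H}/\partial q$ give a linear system, and one extra row fixes the additive constant.

```python
import numpy as np

def sample_hidden(rng, n_hidden, dim, scale=1.0):
    """Data-agnostic sampling of hidden weights and biases (assumed distributions)."""
    W = rng.normal(0.0, scale, size=(n_hidden, dim))
    b = rng.uniform(-np.pi, np.pi, size=n_hidden)
    return W, b

def fit_output_layer(x, x_dot, W, b, x0, h0):
    """Solve for the linear layer (w_out, c) by least squares, no backpropagation.

    x     : (n, 2d) states, columns ordered as (q, p)
    x_dot : (n, 2d) time derivatives (q_dot, p_dot)
    x0,h0 : one reference state and its Hamiltonian value, fixing the constant
    """
    n, dim = x.shape
    d = dim // 2
    act = np.tanh(x @ W.T + b)                     # (n, m) hidden features
    dact = 1.0 - act**2                            # derivative of tanh
    # grad_feats[i, k, j] = d(feature k)/d(x_j) evaluated at sample i
    grad_feats = dact[:, :, None] * W[None, :, :]  # (n, m, 2d)
    dq, dp = grad_feats[:, :, :d], grad_feats[:, :, d:]
    # Hamilton's equations: q_dot = dH/dp and p_dot = -dH/dq
    A_q = np.transpose(dp, (0, 2, 1)).reshape(n * d, -1)
    A_p = -np.transpose(dq, (0, 2, 1)).reshape(n * d, -1)
    A = np.vstack([A_q, A_p])
    A = np.hstack([A, np.zeros((A.shape[0], 1))])  # constant c has zero gradient
    rhs = np.concatenate([x_dot[:, :d].reshape(-1), x_dot[:, d:].reshape(-1)])
    # Extra row: network value at x0 should equal h0
    anchor = np.append(np.tanh(W @ x0 + b), 1.0)
    A = np.vstack([A, anchor])
    rhs = np.append(rhs, h0)
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return coef[:-1], coef[-1]

def predict(x, W, b, w_out, c):
    """Evaluate the approximate Hamiltonian at states x of shape (n, 2d)."""
    return np.tanh(x @ W.T + b) @ w_out + c

def pendulum_demo(seed=0, n_train=2000, n_hidden=200):
    """Single pendulum, H(q, p) = p^2 / 2 + (1 - cos q); returns relative L2 error."""
    rng = np.random.default_rng(seed)
    lo, hi = [-np.pi, -1.0], [np.pi, 1.0]
    x = rng.uniform(lo, hi, size=(n_train, 2))
    x_dot = np.column_stack([x[:, 1], -np.sin(x[:, 0])])  # (q_dot, p_dot)
    W, b = sample_hidden(rng, n_hidden, 2)
    w_out, c = fit_output_layer(x, x_dot, W, b, x0=np.zeros(2), h0=0.0)
    x_test = rng.uniform(lo, hi, size=(1000, 2))
    h_true = 0.5 * x_test[:, 1] ** 2 + 1.0 - np.cos(x_test[:, 0])
    h_pred = predict(x_test, W, b, w_out, c)
    return np.linalg.norm(h_pred - h_true) / np.linalg.norm(h_true)

if __name__ == "__main__":
    print("relative L2 error:", pendulum_demo())
```

In this sketch the only fitted quantities are the last-layer weights and bias, so training reduces to a single call to `np.linalg.lstsq`. A data-driven variant in the spirit of the paper would replace `sample_hidden` with a scheme that uses the training points to place the hidden parameters; that step is not reproduced here.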

Theorems & Definitions (3)

  • Definition A.1
  • Definition A.2
  • Definition A.3