Separable Hamiltonian Neural Networks

Zi-Yu Khoo; Dawen Wu; Jonathan Sze Choong Low; Stéphane Bressan

TL;DR

This work tackles the challenge of learning Hamiltonian dynamics by enforcing additive separability through separable Hamiltonian neural networks (HNNs). It introduces three kinds of bias (observational, learning, and inductive) to embed the separable form $H(q,p)=T(q)+V(p)$ and the resulting zero mixed partial derivative of $H$ with respect to $q$ and $p$, improving both Hamiltonian and vector-field regression and enhancing energy conservation. Empirical results across multiple separable systems show that all separable variants outperform the baseline HNN, with the HNN that combines observational and inductive biases (HNN-OI) delivering the best overall performance and interpretability, even in chaotic and high-dimensional settings. The approach demonstrates how injecting physical priors into machine learning models can yield more accurate and physically faithful dynamics, with a meaningful decomposition into kinetic and potential energy terms.
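
To make the separability condition concrete, the following short derivation applies Hamilton's equations to the separable form $H(q,p)=T(q)+V(p)$. It assumes the standard sign convention for Hamilton's equations and is written for scalar $q$ and $p$ for brevity; the summary above does not spell out either choice.

$$
\begin{aligned}
\dot{q} &= \frac{\partial H}{\partial p} = \frac{\mathrm{d}V}{\mathrm{d}p}(p), \\
\dot{p} &= -\frac{\partial H}{\partial q} = -\frac{\mathrm{d}T}{\mathrm{d}q}(q), \\
\frac{\partial^2 H}{\partial q\,\partial p} &= 0.
\end{aligned}
$$

Under these assumptions, $\dot{q}$ depends only on $p$ and $\dot{p}$ depends only on $q$, which is the structure the separable models can exploit.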

Abstract

Hamiltonian neural networks (HNNs) are state-of-the-art models that regress the vector field of a dynamical system under the learning bias of Hamilton's equations. A recent observation is that embedding a bias regarding the additive separability of the Hamiltonian reduces the regression complexity and improves regression performance. We propose separable HNNs that embed additive separability within HNNs using observational, learning, and inductive biases. We show that the proposed models are more effective than the HNN at regressing the Hamiltonian and the vector field. Consequently, the proposed models predict the dynamics and conserve the total energy of the Hamiltonian system more accurately.
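
One way to picture an inductive bias for additive separability is a model built from two independent sub-networks, one taking only $q$ and one taking only $p$, whose outputs are summed. The sketch below is a minimal illustration under that assumption and is not the authors' implementation: the helper names (`make_scalar_net`, `mixed_partial`, `vector_field`), the network sizes, the untrained random weights, and the use of finite differences in place of automatic differentiation are all choices made for this example.

```python
import math
import random


def make_scalar_net(n_inputs, hidden, rng):
    """Build a one-hidden-layer tanh network with fixed random weights.

    Hypothetical helper for illustration; the weights are untrained.
    """
    w = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(hidden)]
    b = [rng.uniform(-1, 1) for _ in range(hidden)]
    a = [rng.uniform(-1, 1) for _ in range(hidden)]

    def net(*x):
        total = 0.0
        for w_j, b_j, a_j in zip(w, b, a):
            pre = sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + b_j
            total += a_j * math.tanh(pre)
        return total

    return net


def mixed_partial(h, q, p, eps=1e-3):
    """Central finite-difference estimate of d^2 H / (dq dp)."""
    return (h(q + eps, p + eps) - h(q + eps, p - eps)
            - h(q - eps, p + eps) + h(q - eps, p - eps)) / (4 * eps * eps)


def vector_field(h, q, p, eps=1e-5):
    """Hamilton's equations via finite differences: (q_dot, p_dot)."""
    dh_dq = (h(q + eps, p) - h(q - eps, p)) / (2 * eps)
    dh_dp = (h(q, p + eps) - h(q, p - eps)) / (2 * eps)
    return dh_dp, -dh_dq


rng = random.Random(0)
f_q = make_scalar_net(1, 8, rng)    # sub-network that sees only q
g_p = make_scalar_net(1, 8, rng)    # sub-network that sees only p
joint = make_scalar_net(2, 8, rng)  # unconstrained network on (q, p)


def separable_h(q, p):
    """Separable surrogate Hamiltonian: sum of the two sub-networks."""
    return f_q(q) + g_p(p)


sep_mixed = mixed_partial(separable_h, 0.3, -0.7)
joint_mixed = mixed_partial(joint, 0.3, -0.7)
print("mixed partial, separable model:", sep_mixed)
print("mixed partial, joint model:    ", joint_mixed)
```

In this construction the mixed partial derivative vanishes for any weights, so separability does not have to be learned from data, whereas the unconstrained network generally has a nonzero mixed partial.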

Paper Structure (20 sections, 18 equations, 14 figures, 2 tables)

Introduction
Background
Hamiltonian Systems
Separable Hamiltonian Systems
Methodology
The HNN Baseline
The HNN with Observational Bias (HNN-O)
The HNN with Learning Bias (HNN-L)
The HNN with Inductive Bias (HNN-I)
The HNN with Multiple Biases
Experiments for General Systems
Hyperparameter Tuning
Accuracy
Efficiency
Experiments for Challenging Systems
...and 5 more sections

Figures (14)

Figure 1: Left: Nonlinear pendulum vector field (black arrows) and Hamiltonian (heatmap). Right: Random samples $(q, p, \dot{q}, \dot{p})$ of the nonlinear pendulum vector field.
Figure 2: Leftmost: Architecture of the baseline HNN. Second from left: Proposed separable HNN with observational bias. Second from right: Proposed separable HNN with learning bias. Rightmost: Proposed separable HNN with inductive bias.
Figure 3: Left: 2 new samples (in red) are created from 2 old samples (in blue). Right: 12 new samples are created from 4 samples. Generally, quadratically more new samples are created.
Figure 4: Performance of the HNN-O with $\mu\in \{1,2,3,4,5,10,20,30,40,50\}$ for the Nonlinear Pendulum (red, solid line), Anisotropic Oscillator (yellow, densely dotted line), Henon Heiles (green, loosely dashed line), Toda Lattice (blue, densely dashed line), and Coupled Oscillator (purple, loosely dotted line) systems. Left: $E_H$ and standard error. Center: $E_V$ and standard error. Right: Wall-clock time taken until convergence in seconds and standard error.
Figure 5: Performance of the HNN-L with $c_3\in \{0.25, 0.50, 1.00, 2.00, 4.00\}$ for the Nonlinear Pendulum (red, solid line), Anisotropic Oscillator (yellow, densely dotted line), Henon Heiles (green, loosely dashed line), Toda Lattice (blue, densely dashed line), and Coupled Oscillator (purple, loosely dotted line) systems. Left: $E_H$ and standard error. Center: $E_V$ and standard error. Right: Wall-clock time taken until convergence in seconds and standard error.
...and 9 more figures
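
The sample counts in the Figure 3 caption (2 new samples from 2, 12 new samples from 4) are consistent with pairing the $q$ of one sample with the $p$ of another, which gives $n(n-1)$ new samples from $n$ originals. The sketch below illustrates that pairing under the assumption that, for a separable system, $\dot{q}$ depends only on $p$ and $\dot{p}$ depends only on $q$. The function names are hypothetical, and the pendulum Hamiltonian $H = p^2/2 + (1 - \cos q)$ used for the check is an assumed textbook form rather than one taken from the paper.

```python
import math
from itertools import permutations


def pendulum_sample(q, p):
    """Sample (q, p, q_dot, p_dot) for the assumed H = p**2 / 2 + (1 - cos q)."""
    return (q, p, p, -math.sin(q))


def augment_separable(samples):
    """Create new samples by pairing q from one sample with p from another.

    For a separable system, p_dot is determined by q alone and q_dot by p
    alone, so the derivatives can be recombined without new observations.
    """
    new_samples = []
    for sample_i, sample_j in permutations(samples, 2):
        q_i, _, _, p_dot_i = sample_i
        _, p_j, q_dot_j, _ = sample_j
        new_samples.append((q_i, p_j, q_dot_j, p_dot_i))
    return new_samples


two = [pendulum_sample(0.1, 0.5), pendulum_sample(1.2, -0.3)]
four = two + [pendulum_sample(-0.8, 0.9), pendulum_sample(2.0, 0.0)]

new_from_two = augment_separable(two)
new_from_four = augment_separable(four)
print(len(new_from_two), len(new_from_four))
```

Each recombined tuple matches what direct evaluation of the assumed pendulum vector field gives at the new point $(q_i, p_j)$, so the extra samples cost no additional observations.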

Theorems & Definitions (1)

Definition 1: Additive Separability
