Stable Port-Hamiltonian Neural Networks

Fabian J. Roth, Dominik K. Klein, Maximilian Kannapinn, Jan Peters, Oliver Weeger

TL;DR

This work addresses the fragility of purely data-driven dynamic models by embedding physical priors into learning via stable port-Hamiltonian neural networks (sPHNN). By enforcing a convex, positive-definite Hamiltonian and an energy-dissipative structure, the approach guarantees global Lyapunov stability (global asymptotic stability when dissipation is present) without relying on projection-based constraints. Across spinning rigid body dynamics, cascaded tanks, thermal food-processing surrogates, and additive-manufacturing simulations, sPHNNs achieve superior stability, accuracy, and generalization from sparse data. A learnable-equilibrium variant (sPHNN-LM) can infer the equilibrium location when it is unknown. The results demonstrate that energy-based priors enable safe, data-efficient surrogate modeling for multi-physics systems and provide interpretable components by separating conservative, dissipative, and input-driven dynamics.

Abstract

In recent years, nonlinear dynamic system identification using artificial neural networks has garnered attention due to its broad potential applications across science and engineering. However, purely data-driven approaches often struggle with extrapolation and may yield physically implausible forecasts. Furthermore, the learned dynamics can exhibit instabilities, making it difficult to apply such models safely and robustly. This article introduces stable port-Hamiltonian neural networks, a machine learning architecture that incorporates physical biases of energy conservation and dissipation while ensuring global Lyapunov stability of the learned dynamics. Through illustrative and real-world examples, we demonstrate that these strong inductive biases facilitate robust learning of stable dynamics from sparse data, while avoiding instability and surpassing purely data-driven approaches in accuracy and physically meaningful generalization. Furthermore, the model's applicability and potential for data-driven surrogate modeling are showcased on multi-physics simulation data.

Paper Structure

This paper contains 22 sections, 3 theorems, 23 equations, 10 figures, 3 tables.

Key Result

Theorem 3.1

Consider the port-Hamiltonian system (eq:isphs_evolution) in the unforced case $\boldsymbol{u}(t)=\boldsymbol{0}$. Suppose the Hamiltonian ${\mathcal{H}}(\boldsymbol{x})$ is convex, twice continuously differentiable, and fulfills: […] Then, ${\mathcal{H}}$ is a suitable Lyapunov function for showing stability of the equilibrium at $\boldsymbol{x}(t)=\boldsymbol{0}$, and all solutions are bounded. Furthermore, the equilibrium […]
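
The energy argument behind a statement of this kind can be sketched in a few lines. The block below assumes that eq:isphs_evolution has the standard input-state-output port-Hamiltonian form, with a skew-symmetric $\boldsymbol{J}$ and a dissipation matrix $\boldsymbol{R}=\boldsymbol{L}\boldsymbol{L}^\top$ built from the lower triangular $\boldsymbol{L}$ named in the Figure 2 caption. The symbol $\boldsymbol{R}$ and this exact form are assumptions, since the equation itself is not shown here.

$$
\begin{aligned}
\dot{\boldsymbol{x}} &= \left[\boldsymbol{J}(\boldsymbol{x}) - \boldsymbol{R}(\boldsymbol{x})\right]\nabla{\mathcal{H}}(\boldsymbol{x}) + \boldsymbol{G}(\boldsymbol{x})\,\boldsymbol{u},
\qquad \boldsymbol{J} = -\boldsymbol{J}^\top,\quad \boldsymbol{R} = \boldsymbol{L}\boldsymbol{L}^\top \succeq 0,\\
\left.\frac{\mathrm{d}{\mathcal{H}}}{\mathrm{d}t}\right|_{\boldsymbol{u}=\boldsymbol{0}}
&= \nabla{\mathcal{H}}^\top \dot{\boldsymbol{x}}
= \underbrace{\nabla{\mathcal{H}}^\top \boldsymbol{J}\,\nabla{\mathcal{H}}}_{=\,0}
- \nabla{\mathcal{H}}^\top \boldsymbol{R}\,\nabla{\mathcal{H}}
= -\left\lVert \boldsymbol{L}^\top \nabla{\mathcal{H}} \right\rVert^2 \le 0.
\end{aligned}
$$

Under these assumptions the skew-symmetric part drops out of the energy balance, so ${\mathcal{H}}$ cannot increase along unforced trajectories. A convex ${\mathcal{H}}$ with a strict minimum at the origin has bounded sublevel sets, which is consistent with the boundedness of all solutions stated in the theorem.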

Figures (10)

  • Figure 1: Spinning rigid body: Interquartile mean (lines) and range (shaded regions) of the energy $E$ computed from the predicted states. Left: Models without stability bias; Right: Models with stability bias.
  • Figure 2: sPHNN model architecture: (a) Computation graph of the sPHNN. The FICNN parameterizing $f$ is normalized to obtain ${\mathcal{H}}$, and the outputs of the FFNN for $\boldsymbol{J}$, $\boldsymbol{L}$, and $\boldsymbol{G}$ are reshaped to be skew-symmetric, lower triangular, and rectangular matrices, respectively. (b) Dynamics of a randomly initialized sPHNN for $n=2$, showing the built-in interpretability through the separation into conservative and dissipative dynamics.
  • Figure 3: Cascaded tanks: (a) RMSE of the models on the training and test trajectories. (b) Predictions for the extended test trajectory. At $t=4096\,\mathrm{s}$, the pump is turned off. Lines correspond to the interquartile mean and shaded areas represent the interquartile range of the predictions from the 20 model instances.
  • Figure 4: Thermal food processing surrogate: (a) and (b) RMSE evaluated on 15 test trajectories for various numbers of augmented dimensions $n_A$ and training trajectories $n_D$. (c) Interquartile mean and range of the $T_A$ predictions for a custom test case. Top row: Varying number of augmented dimensions with fixed number $n_D=2$ of training trajectories. Bottom row: Varying number of training trajectories with fixed number $n_A=3$ of augmented dimensions. The time $t=1395\,\mathrm{s}$ marks the length of the training trajectories.
  • Figure 5: Heat source field and temperature field predictions on the cuboid domain for a test case with $v=12.5\,\mathrm{mm/s}$, $Q=400\,\mathrm{W}$. The instances selected for this evaluation resulted in the median test error for the corresponding model type. Colors are clipped to remain in the legend's range.
  • ...and 5 more figures
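
The structural constraints described in the Figure 2 caption can be illustrated with a minimal, dependency-free Python sketch for $n=2$. It is not the authors' implementation, and several pieces are assumptions made for illustration: a small sum of softplus units stands in for the FICNN $f$, the normalization (subtracting the value and tangent of $f$ at the origin and adding a small quadratic term `EPS`) is one possible way to obtain a Hamiltonian with a minimum at the origin, and the constant scalars passed to `structure_matrices` stand in for the state-dependent FFNN outputs. All function and variable names are hypothetical.

```python
import math
import random

# Hypothetical stand-in for the FICNN: a sum of softplus units, convex in x.
W = [(1.0, -0.5), (0.3, 0.8), (-0.7, 0.2)]  # illustrative weights, not from the paper
B = [0.1, -0.4, 0.6]
EPS = 1e-2  # assumed small quadratic term that makes H strictly convex


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    """Derivative of softplus, written with tanh to avoid overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def f(x):
    """Convex scalar function of the 2D state x."""
    return sum(softplus(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(W, B))


def grad_f(x):
    """Analytic gradient of f."""
    s = [sigmoid(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(W, B)]
    return [sum(si * w[i] for si, w in zip(s, W)) for i in range(2)]


F0, G0 = f((0.0, 0.0)), grad_f((0.0, 0.0))


def hamiltonian(x):
    """Assumed normalization: remove value and tangent of f at 0, add EPS*|x|^2."""
    tangent = G0[0] * x[0] + G0[1] * x[1]
    return f(x) - F0 - tangent + EPS * (x[0] ** 2 + x[1] ** 2)


def grad_hamiltonian(x):
    """Gradient of the normalized Hamiltonian; zero at the origin by construction."""
    g = grad_f(x)
    return [g[i] - G0[i] + 2.0 * EPS * x[i] for i in range(2)]


def structure_matrices(a, l11, l21, l22):
    """Reshape raw scalars: J is skew-symmetric, R = L L^T with L lower triangular."""
    J = [[0.0, a], [-a, 0.0]]
    R = [[l11 * l11, l11 * l21], [l11 * l21, l21 * l21 + l22 * l22]]
    return J, R


def unforced_rhs(x, J, R):
    """dx/dt = (J - R) grad H(x) for u = 0."""
    g = grad_hamiltonian(x)
    return [sum((J[i][j] - R[i][j]) * g[j] for j in range(2)) for i in range(2)]


def energy_rate(x, J, R):
    """dH/dt along the unforced flow: grad H(x)^T dx/dt."""
    g = grad_hamiltonian(x)
    dx = unforced_rhs(x, J, R)
    return g[0] * dx[0] + g[1] * dx[1]


if __name__ == "__main__":
    rng = random.Random(0)
    J, R = structure_matrices(a=1.3, l11=0.4, l21=-0.2, l22=0.7)
    pts = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(1000)]
    print("max dH/dt over samples:", max(energy_rate(p, J, R) for p in pts))
    print("min H over samples:", min(hamiltonian(p) for p in pts))
```

In this sketch the skew-symmetric `J` contributes nothing to `energy_rate`, while `R = L L^T` can only remove energy. This is the separation into conservative and dissipative dynamics that the caption refers to, and it holds for any values of the raw parameters, which is why no projection step is needed.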

Theorems & Definitions (4)

  • Theorem 3.1
  • Lemma A.1
  • Theorem A.2
  • proof