Rapid training of Hamiltonian graph networks using random features

Atamert Rahma, Chinmay Datar, Ana Cukarska, Felix Dietrich

TL;DR

The paper tackles the slow, gradient-based training of physics-informed Hamiltonian graph networks for N-body dynamics. It introduces Random Feature Hamiltonian Graph Networks (RF-HGN), which replace iterative optimization with random feature-based dense layers and a linear least-squares readout, enabling gradient-descent-free training. By enforcing translation, rotation, and permutation invariances, RF-HGN demonstrates robust zero-shot generalization from small graphs to very large ones while maintaining energy-consistent dynamics. Across mass-spring, Lennard-Jones, and molecular dynamics benchmarks in up to 3D with thousands of particles, the method achieves comparable accuracy to state-of-the-art models but with dramatically faster training times, challenging the dominance of gradient-based optimization in physics-informed learning.
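The idea of random hidden features followed by a linear least-squares readout can be sketched on a toy problem. The example below is a minimal illustration, not the paper's implementation: it uses a single-particle harmonic oscillator instead of a graph network, plain Gaussian and uniform sampling for the hidden weights (an assumption; the paper's sampling scheme may differ), and hypothetical names such as `phi`, `grad_phi`, and `H_hat`. Because the Hamiltonian is linear in the readout coefficients, Hamilton's equations turn the fit into one linear solve.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples z = (q, p) and their time derivatives for H(q, p) = (q^2 + p^2) / 2.
z = rng.uniform(-1.0, 1.0, size=(500, 2))
z_dot = np.stack([z[:, 1], -z[:, 0]], axis=1)  # (dq/dt, dp/dt) = (p, -q)

# Hamilton's equations: dH/dq = -dp/dt and dH/dp = dq/dt.
grad_H = np.stack([-z_dot[:, 1], z_dot[:, 0]], axis=1)

# Hidden layer: sampled once, never trained (sampling choice is an assumption).
K = 200
W = rng.normal(size=(2, K))
b = rng.uniform(-1.0, 1.0, size=K)

def phi(z):
    """Random features, shape (n, K)."""
    return np.tanh(z @ W + b)

def grad_phi(z):
    """Gradient of each feature with respect to z, shape (n, 2, K)."""
    s = 1.0 - phi(z) ** 2
    return s[:, None, :] * W[None, :, :]

# Linear system: sum_k c_k * grad(phi_k)(z_i) = grad_H(z_i).
A = grad_phi(z).reshape(-1, K)
A = np.hstack([A, np.zeros((A.shape[0], 1))])  # constant term has zero gradient

# One extra row fixes the integration constant via H(0, 0) = 0.
anchor = np.hstack([phi(np.zeros((1, 2))), np.ones((1, 1))])
A = np.vstack([A, anchor])
rhs = np.concatenate([grad_H.reshape(-1), [0.0]])

# Readout: a single least-squares solve replaces iterative optimization.
c, *_ = np.linalg.lstsq(A, rhs, rcond=None)

def H_hat(z):
    """Learned Hamiltonian evaluated at the rows of z."""
    return np.hstack([phi(z), np.ones((len(z), 1))]) @ c

z_test = rng.uniform(-1.0, 1.0, size=(100, 2))
H_true = 0.5 * np.sum(z_test ** 2, axis=1)
err = float(np.max(np.abs(H_hat(z_test) - H_true)))
print(f"max abs error in H on test points: {err:.2e}")
```

Only the readout vector `c` is fitted; the hidden weights `W` and `b` stay at their sampled values, which is what removes the gradient-descent loop in this sketch.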

Abstract

Learning dynamical systems that respect physical symmetries and constraints remains a fundamental challenge in data-driven modeling. Integrating physical laws with graph neural networks facilitates principled modeling of complex N-body dynamics and yields accurate and permutation-invariant models. However, training graph neural networks with iterative, gradient-based optimization algorithms (e.g., Adam, RMSProp, LBFGS) often leads to slow training, especially for large, complex systems. In comparison to 15 different optimizers, we demonstrate that Hamiltonian Graph Networks (HGN) can be trained up to 600x faster--but with comparable accuracy--by replacing iterative optimization with random feature-based parameter construction. We show robust performance in diverse simulations, including N-body mass-spring and molecular systems in up to 3 dimensions and 10,000 particles with different geometries, while retaining essential physical invariances with respect to permutation, rotation, and translation. Our proposed approach is benchmarked using a NeurIPS 2022 Datasets and Benchmarks Track publication to further demonstrate its versatility. We reveal that even when trained on minimal 8-node systems, the model can generalize in a zero-shot manner to systems as large as 4096 nodes without retraining. Our work challenges the dominance of iterative gradient-descent-based optimization algorithms for training neural network models for physical systems.

Paper Structure

This paper contains 44 sections, 12 equations, 16 figures, 29 tables, 1 algorithm.

Figures (16)

  • Figure 1: We propose an efficient training method for Hamiltonian graph networks using random feature sampling and linear solvers (left; see also Figure 3). The HGN captures ground truth dynamics of physical systems (shown: chain of 10 nodes, trained on 5) and trains up to 600× faster than state-of-the-art (SOTA) optimizers.
  • Figure 2: Illustration of train and test N-body system positions showcasing the RF-HGN’s translation- and rotation-invariance, and its zero-shot generalization capability, validated by conserved Hamiltonian and low trajectory prediction errors for the test data (see \ref{app_fig2_details} for details).
  • Figure 3: Random-feature Hamiltonian graph neural network architecture. Left (green box): Construction of node and edge encodings $h_{src}^{V}$ and $h^E$ from translation- and rotation-invariant representations of the positions $q$ and momenta $p$ of an N-body system. Right (orange box): Construction of a global encoding for the graph using message passing. In RF-HGN, dense layers (blue) are constructed with random features, and linear layer weights (red) are optimized by solving a linear problem.
  • Figure 4: Graphs considered in the experiments: (a) 3D lattice (nodes arranged on a 2D grid, moving in 3D space; see \ref{section:results:optim-study} and \ref{section:results:zero-shot}), (b) an open chain (nodes moving in 2D space; see \ref{section:results:zero-shot}), (c) molecules interacting through a Lennard-Jones potential (nodes moving in 2D space with dynamic edges; see \ref{section:results:zero-shot}), and (d) 2D closed chain (nodes moving in 2D space; see \ref{sec:benchmarking}).
  • Figure 5: Illustration of accurate zero-shot generalization for the 3D lattice (see Figure 4(a)): Training on smaller systems (left) enables accurate predictions (right) on extremely large test systems (middle).
  • ...and 11 more figures
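
The invariances mentioned in the captions of Figures 2 and 3 can be checked numerically on a toy encoding. The sketch below is an illustration under stated assumptions, not the paper's architecture: `graph_encoding` is a hypothetical function that sums a feature of each edge length, which makes it unchanged under translation, rotation, and node relabeling.

```python
import numpy as np

rng = np.random.default_rng(1)

def graph_encoding(q, edges):
    """Toy global encoding (hypothetical): sum over edges of tanh(edge length)."""
    src, dst = edges[:, 0], edges[:, 1]
    r = np.linalg.norm(q[dst] - q[src], axis=1)  # unchanged by shifts and rotations
    return float(np.sum(np.tanh(r)))             # the sum ignores edge ordering

# Open chain of 5 nodes moving in 2D.
q = rng.normal(size=(5, 2))
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])

# Rigid motion: rotate by theta, then translate.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
q_moved = q @ R.T + np.array([3.0, -2.0])

# Relabel the nodes: new node j holds old node perm[j].
perm = rng.permutation(5)
q_perm = q[perm]
inv = np.argsort(perm)   # old index i -> new index inv[i]
edges_perm = inv[edges]  # rewrite edges in the new labels

e_ref = graph_encoding(q, edges)
e_moved = graph_encoding(q_moved, edges)
e_perm = graph_encoding(q_perm, edges_perm)
print(e_ref, e_moved, e_perm)
```

Because the encoding depends only on edge lengths and a sum, the same function applies unchanged to graphs with more nodes and edges, which is the structural property that zero-shot generalization to larger systems relies on.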