Equivariant Spatio-Temporal Attentive Graph Networks to Simulate Physical Dynamics
Liming Wu, Zhichao Hou, Jirui Yuan, Yu Rong, Wenbing Huang
TL;DR
ESTAG addresses non-Markovian dynamics in physical systems by reformulating dynamics simulation as a spatio-temporal prediction task while enforcing $E(3)$-equivariance. It introduces an Equivariant Discrete Fourier Transform (EDFT) to extract frequency features from the trajectory history, and combines an Equivariant Spatial Module (ESM) with an Equivariant Temporal Module (ETM) to perform iterative spatial and temporal message passing. The model outperforms typical spatio-temporal GNNs and equivariant GNNs on molecular-, protein-, and macro-level datasets, and ablations indicate that EDFT, attention, and equivariance each contribute to this result. The approach provides a principled, symmetry-preserving framework for simulating complex dynamics under unobserved environmental factors, with potential applications in molecular dynamics, protein folding, and robotics. Future work could integrate energy-conservation priors and multi-scale GNNs to handle larger systems and longer horizons.
Abstract
Learning to represent and simulate the dynamics of physical systems is a crucial yet challenging task. Existing methods based on equivariant Graph Neural Networks (GNNs) encode the symmetries of physics, e.g., translations and rotations, which leads to better generalization. Nevertheless, their frame-to-frame formulation of the task overlooks the non-Markovian property, which arises mainly from unobserved dynamics in the environment. In this paper, we reformulate dynamics simulation as a spatio-temporal prediction task, using the trajectory over a past period to recover the non-Markovian interactions. To this end, we propose Equivariant Spatio-Temporal Attentive Graph Networks (ESTAG), an equivariant version of spatio-temporal GNNs. At its core, we design a novel Equivariant Discrete Fourier Transform (EDFT) to extract periodic patterns from the history frames, and then construct an Equivariant Spatial Module (ESM) to perform spatial message passing and an Equivariant Temporal Module (ETM), with forward attention and equivariant pooling mechanisms, to aggregate temporal messages. We evaluate our model on three real-world datasets at the molecular, protein, and macro levels. Experimental results verify the effectiveness of ESTAG compared to typical spatio-temporal GNNs and equivariant GNNs.
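To make the idea of an equivariant Fourier transform over history frames concrete, the following is a minimal NumPy sketch, not the paper's EDFT definition. It assumes that each node's trajectory is centered on its own temporal mean before a standard DFT is taken along the time axis; the function name `edft_sketch`, the choice of centering, and the toy array sizes are all illustrative assumptions. Because the DFT is linear in the coordinates, the resulting coefficients transform with an orthogonal matrix in the same way as the input, while their amplitudes stay unchanged.

```python
import numpy as np


def edft_sketch(traj):
    """Hypothetical EDFT-like transform (illustrative, not the paper's exact form).

    traj: real array of shape (T, N, 3), positions of N nodes over T frames.
    Returns a complex array of shape (T, N, 3) of frequency coefficients.
    """
    # Assumption: remove each node's temporal mean so that a global
    # translation of the whole trajectory cancels out.
    centered = traj - traj.mean(axis=0, keepdims=True)
    # Standard DFT along the time axis; it is linear in the coordinates,
    # so it commutes with any fixed 3x3 orthogonal matrix.
    return np.fft.fft(centered, axis=0)


rng = np.random.default_rng(0)
traj = rng.normal(size=(8, 5, 3))  # toy trajectory: 8 frames, 5 nodes

# Random orthogonal matrix (rotation or reflection) and random translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
shift = rng.normal(size=3)

coeffs = edft_sketch(traj)
coeffs_moved = edft_sketch(traj @ Q.T + shift)

# Equivariance: transforming the input transforms the coefficients identically.
equivariant = np.allclose(coeffs_moved, coeffs @ Q.T)

# Invariance: per-node, per-frequency amplitudes are unaffected.
amp = np.linalg.norm(coeffs, axis=-1)
amp_moved = np.linalg.norm(coeffs_moved, axis=-1)
invariant = np.allclose(amp, amp_moved)

print("equivariant:", equivariant, "| amplitudes invariant:", invariant)
```

In this sketch the invariant amplitudes could serve as scalar frequency features and the complex coefficients as equivariant vector features; how ESTAG actually uses the EDFT output in the ESM and ETM is specified in the paper itself rather than here.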
