Equivariant Graph Neural Operator for Modeling 3D Dynamics

Minkai Xu, Jiaqi Han, Aaron Lou, Jean Kossaifi, Arvind Ramanathan, Kamyar Azizzadenesheli, Jure Leskovec, Stefano Ermon, Anima Anandkumar

TL;DR

This work introduces EGNO, an SE(3)-equivariant neural operator that models 3D relational dynamics as continuous trajectories, rather than single-step predictions. By embedding temporal evolution in Fourier space through equivariant temporal convolutions and stacking them with a graph neural backbone, EGNO achieves state-of-the-art performance across N-body, motion capture, and molecular dynamics benchmarks, while supporting parallel decoding and discretization-free trajectory predictions. The approach delivers strong data efficiency, generalizes to different temporal resolutions, and is compatible with various backbones, including EGNN and EGHN. These capabilities enable accurate, scalable modeling of complex geometric dynamics with broad scientific and practical impact.

Abstract

Modeling the complex three-dimensional (3D) dynamics of relational systems is an important problem in the natural sciences, with applications ranging from molecular simulations to particle mechanics. Machine learning methods have achieved considerable success by training graph neural networks to model spatial interactions. However, these approaches do not faithfully capture temporal correlations, since they only model next-step predictions. In this work, we propose Equivariant Graph Neural Operator (EGNO), a novel and principled method that directly models dynamics as trajectories instead of next-step predictions. Unlike existing methods, EGNO explicitly learns the temporal evolution of 3D dynamics: we formulate the dynamics as a function over time and learn neural operators to approximate it. To capture temporal correlations while preserving the intrinsic SE(3)-equivariance, we develop equivariant temporal convolutions parameterized in Fourier space and build EGNO by stacking the Fourier layers over equivariant networks. EGNO is the first operator learning framework capable of modeling solution dynamics as functions over time while retaining 3D equivariance. Comprehensive experiments in multiple domains, including particle simulations, human motion capture, and molecular dynamics, demonstrate significantly superior performance of EGNO over existing methods, thanks to the equivariant temporal modeling. Our code is available at https://github.com/MinkaiXu/egno.

Paper Structure

This paper contains 34 sections, 3 theorems, 19 equations, 8 figures, 9 tables.

Key Result

Theorem 4.1

By parameterizing the kernel function ${\mathcal{K}}_\theta$ with Eqs. (eq:method-temporal-conv) and (eq:method-nonlinear), we have that ${\mathcal{T}}_\theta$ is an SO(3)-equivariant operator, i.e., $({\mathcal{T}}_\theta ({\mathbf{R}} f)) (t) = ({\mathbf{R}}({\mathcal{T}}_\theta f)) (t)$.
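
The equivariance property in Theorem 4.1 can be checked numerically on a toy example. The sketch below is not the authors' implementation: it assumes a simplified temporal convolution that takes a discrete Fourier transform along time, multiplies the lowest $I$ modes by complex weights shared across the three spatial coordinates, drops the remaining modes, and keeps the real part of the inverse transform. It omits the nonlinearity and any channel mixing. All function names (`dft`, `idft`, `temporal_conv`, `rotate`, `rotation_matrix`) are hypothetical helpers written for this illustration, and the naive transform stands in for an FFT routine.

```python
import cmath
import math
import random


def dft(signal):
    """Naive O(P^2) discrete Fourier transform along the time axis."""
    P = len(signal)
    return [sum(signal[p] * cmath.exp(-2j * math.pi * k * p / P) for p in range(P))
            for k in range(P)]


def idft(spectrum):
    """Inverse of dft."""
    P = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * p / P) for k in range(P)) / P
            for p in range(P)]


def temporal_conv(traj, weights):
    """Toy temporal convolution in Fourier space (assumed form, see lead-in).

    traj:    list of P 3-vectors, one vector feature sampled over time.
    weights: complex multipliers for the lowest len(weights) modes.
    The same weights are applied to x, y and z; higher modes are dropped.
    """
    P = len(traj)
    out = [[0.0, 0.0, 0.0] for _ in range(P)]
    for axis in range(3):
        spectrum = dft([v[axis] for v in traj])
        filtered = [weights[k] * spectrum[k] if k < len(weights) else 0j
                    for k in range(P)]
        for p, value in enumerate(idft(filtered)):
            out[p][axis] = value.real  # keep the real part only
    return out


def rotate(R, traj):
    """Apply the 3x3 matrix R to every 3-vector of the trajectory."""
    return [[sum(R[i][j] * v[j] for j in range(3)) for i in range(3)] for v in traj]


def rotation_matrix(a, b):
    """Rotation about z by angle a, followed by rotation about x by angle b."""
    rz = [[math.cos(a), -math.sin(a), 0.0],
          [math.sin(a), math.cos(a), 0.0],
          [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0],
          [0.0, math.cos(b), -math.sin(b)],
          [0.0, math.sin(b), math.cos(b)]]
    return [[sum(rx[i][k] * rz[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


random.seed(0)
P, I = 8, 3  # number of time points and retained Fourier modes
traj = [[random.gauss(0, 1) for _ in range(3)] for _ in range(P)]
weights = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(I)]
R = rotation_matrix(0.7, -1.2)

lhs = temporal_conv(rotate(R, traj), weights)  # T(R f)
rhs = rotate(R, temporal_conv(traj, weights))  # R (T f)
max_gap = max(abs(a - b) for u, v in zip(lhs, rhs) for a, b in zip(u, v))
print(max_gap)  # should sit at floating-point round-off level
```

The two sides agree because every step of this toy operator (transform, per-mode scaling, truncation, real part) is real-linear and acts identically on each spatial coordinate, so it commutes with a real rotation matrix. Using different weights per coordinate would generally break the equality.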

Figures (8)

  • Figure 1: Illustration of EGNO. EGNO blocks (green) can be built with any EGNN layers (blue) and the proposed equivariant temporal convolution layers (yellow). Consider discretizing the time window $\Delta T$ into $P$ points $\{\Delta t_1, \dots, \Delta t_P\}$. Given a current state ${\mathcal{G}}^{(t)}$, we first repeat its features $P$ times, concatenate the repeated features with time embeddings, and feed them into $L$ EGNO blocks. Within each block, the temporal layers operate on the temporal and channel dimensions, while the EGNN layers operate on the node and channel dimensions. Finally, EGNO predicts future dynamics as a function $f_{\mathcal{G}}(t)$ and decodes a trajectory of states $\{{\mathcal{G}}^{(t+\Delta t_p)}\}_{p=1}^P$ in parallel.
  • Figure 2: Ablation studies on the number of modes $I$ on N-body simulation and Mocap-Run datasets.
  • Figure 3: Qualitative results of zero-shot generalization towards discretization steps. Sub-figures indexed by $P$ or $2P$ have $P$ or $2P$ timesteps while sharing exactly the same initial conditions. Left: Motion Capture Run. Right: N-body simulation. Best viewed in color.
  • Figure 4: Illustration of EGHNO. Here, we omit the input state repetition and time embedding conditioning process shown in Figure 1 and concentrate on the model details.
  • Figure 5: Visualization of the trajectory generated by EGNO with uniform discretization on the N-body simulation dataset. The input is in cyan, the ground truth final snapshot is in red, and the predicted trajectory is in blue. The opacity changes as time elapses.
  • ...and 3 more figures
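
The input construction described in the Figure 1 caption (repeat the current state's features $P$ times, then concatenate time embeddings) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses a sinusoidal embedding of each time offset $\Delta t_p$, which is an assumed choice, and it attaches the embedding only to per-node feature vectors. The names `time_embedding` and `build_temporal_input` and the parameter `emb_dim` are hypothetical.

```python
import math


def time_embedding(dt, dim):
    """Sinusoidal embedding of a scalar time offset (assumed choice; dim must be even)."""
    half = dim // 2
    freqs = [1.0 / (10000 ** (i / half)) for i in range(half)]
    return [math.sin(dt * f) for f in freqs] + [math.cos(dt * f) for f in freqs]


def build_temporal_input(node_feats, time_offsets, emb_dim=4):
    """Repeat node features once per time offset and append that offset's embedding.

    node_feats:   N lists of C floats (features of the current state).
    time_offsets: P floats (the discretized offsets inside the time window).
    Returns a nested list of shape P x N x (C + emb_dim).
    """
    return [[list(h) + time_embedding(dt, emb_dim) for h in node_feats]
            for dt in time_offsets]


# Toy usage: N = 2 nodes with C = 3 features, P = 5 evenly spaced offsets.
node_feats = [[0.1, 0.2, 0.3], [1.0, 0.0, -1.0]]
time_offsets = [0.2 * (p + 1) for p in range(5)]
x = build_temporal_input(node_feats, time_offsets)
print(len(x), len(x[0]), len(x[0][0]))  # P, N, C + emb_dim
```

Every time slice starts from the same state features and differs only in its time embedding, so the later layers can produce all $P$ future states in one pass rather than rolling the model out step by step.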

Theorems & Definitions (5)

  • Theorem 4.1
  • Proposition 1.1
  • Lemma 1.2: Fourier Action Equivariance
  • proof
  • proof: Proof of Theorem 4.1