A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems

Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein

TL;DR

The paper surveys Geometric GNNs for 3D atomic systems, emphasising symmetry-aware design for modelling molecules, proteins, and materials. It organises architectures into four families (invariant, Cartesian-equivariant, spherical-equivariant, and unconstrained) and walks through the pipeline from input graph construction to embeddings, interaction layers, and task-specific outputs. Key contributions include a rigorous treatment of the Cartesian and spherical tensor formalisms, an explicit discussion of tensor products and irreducible representations (irreps), and a survey of datasets and applications ranging from property prediction to molecular dynamics and structure generation. The discussion also highlights open questions: when to build physical constraints into the architecture, how to construct the input graph, and how to scale these models. It points to both theory-driven and data-driven paths for future progress.

Abstract

Recent advances in computational modelling of atomic systems, spanning molecules, proteins, and materials, represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space. In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations. In recent years, Geometric Graph Neural Networks have emerged as the preferred machine learning architecture powering applications ranging from protein structure prediction to molecular simulations and material generation. Their specificity lies in the inductive biases they leverage - such as physical symmetries and chemical properties - to learn informative representations of these geometric graphs. In this opinionated paper, we provide a comprehensive and self-contained overview of the field of Geometric GNNs for 3D atomic systems. We cover fundamental background material and introduce a pedagogical taxonomy of Geometric GNN architectures: (1) invariant networks, (2) equivariant networks in Cartesian basis, (3) equivariant networks in spherical basis, and (4) unconstrained networks. Additionally, we outline key datasets and application areas and suggest future research directions. The objective of this work is to present a structured perspective on the field, making it accessible to newcomers and aiding practitioners in gaining an intuition for its mathematical abstractions.

Paper Structure (67 sections, 49 equations, 29 figures, 1 table)

This paper contains 67 sections, 49 equations, 29 figures, 1 table.

Figures (29)

  • Figure 1: Timeline of key Geometric GNNs for 3D atomic systems, characterised by the type of intermediate representations within layers. This survey presents a self-contained overview of Geometric GNN architectures and their applications in modeling 3D atomic systems.
  • Figure 2: Physical symmetries of 3D atomic systems. The ordering of atoms/nodes in the system is arbitrary. Additionally, global rotations or translations of the system in 3D Euclidean space will lead to an equivalent transformation of 3D coordinates and other geometric attributes. Global properties of the system such as the potential energy are invariant to both permutation and physical symmetries. Geometric GNNs explicitly account for both permutation symmetry and physical transformation behaviours when modeling 3D atomic systems, while standard GNNs solely account for permutations.
  • Figure 3: Graphs and Graph Neural Networks. (a) Graphs model a set of entities as nodes, with edges denoting relationships and structure among them. (b) GNNs build latent representations of graph data through message passing operations, where each node performs learnable feature aggregation from its local neighbourhood. (c) Stacking $L$ message passing layers enables GNNs to send and aggregate information from $L$-hop subgraphs around each node.
  • Figure 4: Geometric graphs and Euclidean symmetries. Geometric graphs embedded in 3D Euclidean space model systems with both geometry and relational structure, such as molecules and materials. The geometric attributes transform along with Euclidean transformations of the system: (1) The group of rotations $\text{SO}(3)$, or rotations and reflections $\text{O}(3)$, acts on the vector features $\vec{\mathbf{v}}$ and coordinates $\vec{\mathbf{x}}$; and (2) The translation group $\text{T}(3)$ acts on the coordinates $\vec{\mathbf{x}}$. Scalar features remain invariant to Euclidean transformations. Note that this setup generalises to multiple vector features $\vec{\mathbf{v}}$ or higher-order tensor type features.
  • Figure 5: Invariant, equivariant, and unconstrained functions. The output of $\mathscr{G}$-invariant functions remains unchanged regardless of transformations applied to the input. $\mathscr{G}$-equivariant functions, on the other hand, exhibit transformations in the output that are equivalent to the transformations in the input. Finally, $\mathscr{G}$-unconstrained functions do not have predictable or known transformations of the output when the input undergoes transformations.
  • ...and 24 more figures
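
The distinction drawn in the captions of Figures 4 and 5 can be checked numerically. The sketch below is illustrative only and is not taken from the paper: the function names (`invariant_fn`, `equivariant_fn`, `unconstrained_fn`), the toy functions themselves, and the random point cloud are all assumptions chosen for brevity. It applies a random rotation, translation, and node permutation to a set of 3D coordinates and tests how each kind of function responds.

```python
import numpy as np

def random_rotation(rng):
    """Sample a 3x3 rotation matrix (det = +1) from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:      # flip one axis to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q

def invariant_fn(x):
    """Toy global scalar: sum of all pairwise distances."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1).sum()

def equivariant_fn(x):
    """Toy per-node vector: relative positions weighted by an invariant scalar."""
    diff = x[None, :, :] - x[:, None, :]                  # diff[i, j] = x_j - x_i
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)
    return (np.exp(-dist) * diff).sum(axis=1)             # sum over neighbours j

def unconstrained_fn(x, w):
    """Toy generic map on raw coordinates: no guaranteed transformation behaviour."""
    return np.tanh(x.reshape(-1) @ w)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))             # 5 hypothetical atoms, one row per atom
rot = random_rotation(rng)
shift = rng.normal(size=3)
perm = rng.permutation(5)
x_moved = x @ rot.T + shift             # rotate every row, then translate

# Invariant: the output ignores rotation, translation, and node ordering.
inv_ok = np.isclose(invariant_fn(x_moved), invariant_fn(x))
inv_perm_ok = np.isclose(invariant_fn(x[perm]), invariant_fn(x))

# Equivariant: the output rotates with the input (and is reordered with the nodes).
equi_ok = np.allclose(equivariant_fn(x_moved), equivariant_fn(x) @ rot.T)
equi_perm_ok = np.allclose(equivariant_fn(x[perm]), equivariant_fn(x)[perm])

# Unconstrained: the output changes in a way that is not known in advance.
w = 0.1 * rng.normal(size=(15, 4))      # hypothetical weights, scaled to avoid saturation
unc_changed = not np.allclose(unconstrained_fn(x_moved, w), unconstrained_fn(x, w))

print(inv_ok, inv_perm_ok, equi_ok, equi_perm_ok, unc_changed)
```

Because `equivariant_fn` is built only from relative positions, its output is unaffected by the translation and picks up only the rotation, which mirrors how the translation group acts on coordinates but not on vector features in Figure 4.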