Learning Long-Range Representations with Equivariant Messages

Egor Rumiantsev, Marcel F. Langer, Tulga-Erdene Sodjargal, Michele Ceriotti, Philip Loche

TL;DR

This work proposes the use of equivariant, rather than scalar, charges for long-range interactions, and designs a graph neural network architecture, LOREM, around this long-range message passing mechanism, with excellent benchmark performance.

Abstract

Machine learning interatomic potentials trained on first-principles reference data are becoming valuable tools for computational physics, biology, and chemistry. Equivariant message-passing neural networks, including transformers, achieve state-of-the-art accuracy but rely on cutoff-based graphs, limiting their ability to capture long-range effects such as electrostatics or dispersion, as well as electron delocalization. While long-range correction schemes based on inverse power laws of interatomic distances have been proposed, they are unable to communicate higher-order geometric information and are thus limited in applicability. To address this shortcoming, we propose the use of equivariant, rather than scalar, charges for long-range interactions, and design a graph neural network architecture, LOREM, around this long-range message passing mechanism. We consider several datasets specifically designed to highlight non-local physical effects, and compare short-range message passing with different receptive fields to invariant and equivariant long-range message passing. Even though most approaches work for careful dataset-specific choices of their model hyperparameters, LOREM works consistently without such changes, with excellent benchmark performance.

Paper Structure

This paper contains 60 sections, 8 equations, 11 figures, 13 tables.

Figures (11)

  • Figure 1: Sketch of the LOREM architecture.
  • Figure 2: ($\mathbf{A}$) Au$_2$ dimer on MgO surface, showing both wetting and non-wetting geometries, as well as the Al dopant. ($\mathbf{B}$) Energy over distance $d$ for the non-wetting geometry for the doped and undoped surface. The minima are indicated with a diamond symbol; the reference energy curve is drawn in grey. Offsets are added to distinguish the curves and the value at the minimum is subtracted. ($\mathbf{C}$) Na$_9$Cl$_8^+$ (top) and Na$_8$Cl$_8^+$ (bottom) clusters; the moving atom is marked with transparent copies of itself, and the distance of interest is labeled with $d$. ($\mathbf{D}$) Energy over distance for both clusters.
  • Figure 3: ($\mathbf{A}$) Illustration of cumulene laid flat, indicating relevant distances between atoms. ($\mathbf{B}$) Energy profile over a 90° rotation of one rotor. The minimum value of each curve is subtracted before plotting. The inset shows a 3D representation of cumulene, defining the dihedral angle $\theta$.
  • Figure 4: ($\mathbf{A}$) Charge-charge pair from the biodimers dataset. ($\mathbf{B}$) Mean absolute error on forces for different models on the different dimer classes: apolar-apolar (AA), charge-apolar (CA), charge-charge (CC), charge-polar (CP), polar-apolar (PA), polar-polar (PP). Note that the vertical axis has been split at 2.75 meV.
  • Figure 5: Energy over the reaction coordinate for the nucleophilic substitution reaction $\ch{Cl^-} + \ch{H_3C-Br} \rightarrow \ch{Cl-CH_3} + \ch{Br^-}$; snapshots are shown as insets.
  • ...and 6 more figures