Smooth, exact rotational symmetrization for deep learning on point clouds

Sergey N. Pozdnyakov; Michele Ceriotti

TL;DR

This work introduces the Equivariant Coordinate System Ensemble (ECSE), a general a posteriori protocol that enforces exact rotational equivariance on any point-cloud backbone while preserving smoothness and invariance to translations and permutations. As a flagship demonstration, the authors present the Point Edge Transformer (PET), an edge-focused transformer that is not intrinsically equivariant yet achieves state-of-the-art results across diverse atomistic datasets, including liquids, molecules, and solids, and can also predict covariant outputs such as vectors. The key contributions are the ECSE framework, the adaptive cutoff and weighting schemes that keep predictions smooth and continuous, and an empirical validation showing that exact rotational symmetry can be imposed with minimal or even favorable changes in accuracy. The work broadens the applicability of generic point-cloud models to physics-aware atomistic simulations and suggests that symmetry constraints can be relaxed during architecture design and enforced afterwards without sacrificing physical fidelity, which eases the transfer of techniques across domains.

Abstract

Point clouds are versatile representations of 3D objects and have found widespread application in science and engineering. Many successful deep-learning models have been proposed that use them as input. The domain of chemical and materials modeling is especially challenging because exact compliance with physical constraints is highly desirable for a model to be usable in practice. These constraints include smoothness and invariance with respect to translations, rotations, and permutations of identical atoms. If these requirements are not rigorously fulfilled, atomistic simulations might lead to absurd outcomes even if the model has excellent accuracy. Consequently, dedicated architectures, which achieve invariance by restricting their design space, have been developed. General-purpose point-cloud models are more varied but often disregard rotational symmetry. We propose a general symmetrization method that adds rotational equivariance to any given model while preserving all the other requirements. Our approach simplifies the development of better atomic-scale machine-learning schemes by relaxing the constraints on the design space and making it possible to incorporate ideas that proved effective in other domains. We demonstrate this idea by introducing the Point Edge Transformer (PET) architecture, which is not intrinsically equivariant but achieves state-of-the-art performance on several benchmark datasets of molecules and solids. A-posteriori application of our general protocol makes PET exactly equivariant, with minimal changes to its accuracy.
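The symmetry requirements listed in the abstract can be checked numerically for any model that maps atomic positions to a scalar. The following is a minimal sketch, not taken from the paper: the two toy models (`pair_energy`, `centered_energy`) and the helper names are hypothetical and serve only to illustrate the difference between a fully invariant model and one that satisfies everything but rotational invariance.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random proper rotation matrix (det = +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix the sign ambiguity of the QR factors
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]           # turn an improper rotation into a proper one
    return q

def pair_energy(positions):
    """Toy model built on pair distances: invariant to all three operations."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(positions), k=1)
    return float(np.sum(np.exp(-dist[upper] ** 2)))

def centered_energy(positions):
    """Toy model on centered x coordinates: translation- and
    permutation-invariant, but NOT rotation-invariant."""
    x = positions[:, 0] - positions[:, 0].mean()
    return float(np.sum(np.tanh(x)))

def check_invariances(model, positions, rng, tol=1e-10):
    """Compare the prediction before and after each symmetry operation."""
    ref = model(positions)
    rot = random_rotation(rng)
    shift = rng.normal(size=3)
    perm = rng.permutation(len(positions))
    return {
        "rotation": abs(model(positions @ rot.T) - ref) < tol,
        "translation": abs(model(positions + shift) - ref) < tol,
        "permutation": abs(model(positions[perm]) - ref) < tol,
    }
```

Calling `check_invariances(pair_energy, positions, rng)` on random positions should report all three invariances as satisfied, whereas `centered_energy` should fail only the rotation check. A finite-tolerance test of this kind can reveal a violation but cannot prove exact symmetry.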

Paper Structure (39 sections, 33 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Equivariant models and atomic-scale applications
Models for generic point clouds
Everything but rotational equivariance
Equivariant Coordinate System Ensemble
Point Edge Transformer
Benchmarks
Discussion
Acknowledgements
Point Edge Transformer
Details of the training protocol
Detailed benchmark results
COLL
HME21
MnO
...and 24 more sections

Figures (8)

Figure 1: (a) Equivariant coordinate-system ensemble: each ordered pair of neighbors defines a local coordinate system. The atomic environment is then projected onto each of these coordinate systems (which is equivalent to rotating it) and used as input for a backbone architecture. If the outputs are covariant, such as vectors, they are rotated back to the initial coordinate system. Finally, the predictions are averaged over all coordinate systems. (b) Discontinuities arising from a plain average; a weighted average with weights $w_{jj'}$ resolves these problems. (c) Cutoff functions ${f_\text{c}}$ and ${q_\text{c}}$ used to define the weights $w_{jj'}$. (d) To reduce the computational cost, an adaptive cutoff ${R_\text{in}}$ is used, which adjusts to a given geometry instead of being a global user-specified constant.
Figure 2: Architecture of the Point-Edge Transformer (PET). White and colored boxes represent layers; gray boxes and lines represent data. (a) PET is a message-passing architecture. At each of the $n_{\text{GNN}}$ message-passing (MP) interactions, messages are communicated between all pairs of atoms closer than a certain cutoff distance $R_c$. At each stage, the corresponding MP block computes output messages and predictions of the target property, given the input messages and the geometry of the point cloud. (b) For each atom in the system, we define an atom-centered environment $A_i$ as the collection of all neighbors within the cutoff distance $R_c$. The MP block is applied to each such atomic environment. Given 1) the geometry of the atomic environment, 2) the chemical species of the atoms, and 3) the input messages from all the neighbors to the central atom, it produces output messages from the central atom to all the neighbors and a contribution to the prediction of the target property. The first step is to encode all the information associated with each neighbor into an abstract token of dimensionality $d_{\text{PET}}$. Next, the collection of such tokens (together with the one associated with the central atom) is fed into a transformer with $n_{\text{TL}}$ self-attention layers. The transformer performs a permutationally covariant transformation, so the association between tokens and neighbors is preserved. Therefore, the output tokens can simply be treated as output messages to the corresponding neighbors. (c) The Encoder layer first maps each source of information to dimensionality $d_{\text{PET}}$. Next, all three tokens are concatenated and compressed into a single token of the desired size.
Figure 3: (a-c) Accuracy of PET potentials ($y_0$) of liquid water, compared with NEQUIP [bazt+22ncomm]. (a) Accuracy for different numbers of message-passing blocks $n_\text{GNN}$ and transformer layers $n_\text{TL}$; (b) Accuracy as a function of $n_\text{GNN}$, for constant $n_\text{GNN}\times n_\text{TL}=12$; (c) Accuracy as a function of cutoff. (d-f) Learning curves for different molecular data sets, comparing symmetrized PET models ($y_\text{S}$) with several previous works [nigam2020recursive, bigi2022smooth, pozd+20prl, bart+13prb, bigi+23arxiv, veit+20jcp, fabe+18jcp, zhan+22jcp], including the current state of the art. (d) Random CH$_4$ dataset, training only on energies; (e) Random CH$_4$ dataset, training on energies and forces; (f) Vectorial dipole moments in the QM9 dataset [veit+20jcp].
Figure 4: Comparison of a standard StepLR learning rate schedule and the strategy we use for the HME21 and water datasets.
Figure 5: Smooth cutoff functions $f_c$, $q_c^1$ and $q_c^2$.
...and 3 more figures
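The procedure described in the caption of Figure 1(a) can be sketched in a few lines of NumPy for a single atomic environment with a vector-valued target. This is an illustrative sketch under stated assumptions, not the authors' implementation: the cosine cutoff, the weight formula (a product of cutoffs times the squared sine of the angle between the two neighbors), the toy backbone, and all function names are hypothetical stand-ins for the paper's $f_\text{c}$, $q_\text{c}$, and $w_{jj'}$. The adaptive cutoff of Figure 1(d) and the case in which all weights vanish are not handled.

```python
import numpy as np

def smooth_cutoff(d, r_cut):
    """Assumed cosine cutoff: 1 at d = 0, decaying to 0 at d >= r_cut."""
    return np.where(d < r_cut, 0.5 * (1.0 + np.cos(np.pi * d / r_cut)), 0.0)

def local_frame(r_a, r_b):
    """Right-handed orthonormal frame from an ordered pair of neighbors.
    The rows of the returned matrix are the frame axes."""
    e1 = r_a / np.linalg.norm(r_a)
    v = r_b - np.dot(r_b, e1) * e1          # Gram-Schmidt step
    e2 = v / np.linalg.norm(v)
    e3 = np.cross(e1, e2)
    return np.stack([e1, e2, e3])

def toy_backbone(coords, W, b):
    """Hypothetical non-equivariant backbone: (n, 3) coordinates -> 3-vector."""
    return np.tanh(coords @ W.T + b).sum(axis=0)

def ecse_predict(neighbors, backbone, r_cut=5.0, eps=1e-12):
    """Weighted average of backbone outputs over all ordered-pair frames.
    `neighbors` holds displacement vectors from the central atom, shape (n, 3)."""
    num, den = np.zeros(3), 0.0
    n = len(neighbors)
    for j in range(n):
        for k in range(n):
            if j == k:
                continue
            r_a, r_b = neighbors[j], neighbors[k]
            d_a, d_b = np.linalg.norm(r_a), np.linalg.norm(r_b)
            # Assumed weight: vanishes at the cutoff and for collinear pairs,
            # where the local frame becomes ill-defined.
            sin2 = np.sum(np.cross(r_a / d_a, r_b / d_b) ** 2)
            w = float(smooth_cutoff(d_a, r_cut) * smooth_cutoff(d_b, r_cut)) * sin2
            if w < eps:                     # numerical guard used in this sketch
                continue
            frame = local_frame(r_a, r_b)
            y_local = backbone(neighbors @ frame.T)   # project the environment
            num += w * (frame.T @ y_local)            # rotate the vector back
            den += w
    return num / den
```

Rotating the input environment by a proper rotation `Q` rotates every frame along with it, so the projected coordinates and the weights are unchanged and the prediction transforms as `Q @ y`, even though `toy_backbone` itself has no such property.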
