Completeness of Atomic Structure Representations
Jigyasa Nigam, Sergey N. Pozdnyakov, Kevin K. Huguenin-Dumittan, Michele Ceriotti
TL;DR
The paper addresses the challenge of obtaining a complete, symmetry-adapted local representation of atomic environments, and highlights the incompleteness of common density-based descriptors at finite body orders. It introduces a finite, triplet-based descriptor built from the relative coordinates of two tagged neighbors and a nonlinear encoder. The descriptor is $O(3)$-invariant and permutation-invariant, and it is complete up to a controllable resolution. Completeness is proven for the local neighborhood and demonstrated on a deliberately constructed set of bispectrum-degenerate $B_8$ structures, where nonlinear triplet features distinguish degenerate pairs that linear triplet or bispectrum descriptors cannot. This enables universal approximators for local properties. The work also connects to the ACE/NICE and MTP frameworks, showing how a low-order nonlinear triplet representation can achieve completeness without requiring arbitrarily high-order linear expansions, with practical improvements in accuracy and stability.
Abstract
In this paper, we address the challenge of obtaining a complete and symmetric representation of groups of point particles, such as atoms in a molecule, which is crucial in physics and theoretical chemistry. The problem has become even more important with the widespread adoption of machine-learning techniques in science, because it underpins the capacity of models to accurately reproduce physical relationships while remaining consistent with fundamental symmetries and conservation laws. However, some of the descriptors that are commonly used to represent point clouds -- most notably those based on discretized correlations of the neighbor density, which underpin most existing ML models of matter at the atomic scale -- are unable to distinguish between special arrangements of particles in three dimensions. This makes it impossible to machine learn the properties of such arrangements. Atom-density correlations are provably complete in the limit in which they simultaneously describe the mutual relationship between all atoms, which is impractical. We present a novel approach to construct descriptors of \emph{finite} correlations based on the relative arrangement of particle triplets. These descriptors can be employed to create symmetry-adapted models with universal approximation capabilities, and their sole convergence parameter is the resolution of the neighbor discretization. Our strategy is demonstrated on a class of atomic arrangements that are specifically built to defy a broad class of conventional symmetric descriptors, showcasing its potential for addressing their limitations.
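The idea that a low-order symmetric descriptor can fail to distinguish genuinely different arrangements can be illustrated with a small toy example. The sketch below is not the construction used in the paper: it assumes a one-dimensional point set, uses global (not atom-centered) descriptors, and the function names `pair_descriptor` and `triplet_descriptor` are hypothetical. It compares two point sets that share the same multiset of pairwise distances (a classic homometric pair) and shows that a descriptor built from point triplets separates them.

```python
from itertools import combinations

# Two 1D point sets with identical pairwise-distance multisets.
# Illustrative toy data only, not the structures studied in the paper.
A = (0, 1, 4, 10, 12, 17)
B = (0, 1, 8, 11, 13, 17)


def pair_descriptor(points):
    """Sorted multiset of all pairwise distances (two-point information)."""
    return sorted(abs(p - q) for p, q in combinations(points, 2))


def triplet_descriptor(points):
    """Sorted multiset of triplets, each stored as its sorted side lengths."""
    return sorted(
        tuple(sorted((abs(p - q), abs(q - r), abs(p - r))))
        for p, q, r in combinations(points, 3)
    )


# Both descriptors are invariant to translation, reflection, and reordering
# of the points, but only the triplet one tells A and B apart.
same_pairs = pair_descriptor(A) == pair_descriptor(B)
same_triplets = triplet_descriptor(A) == triplet_descriptor(B)
print(same_pairs, same_triplets)  # True False
```

Any model that sees only the pair descriptor must assign the same prediction to `A` and `B`, however flexible the model is. This is the sense in which an incomplete representation limits what can be learned, and it is why the abstract emphasizes completeness over model capacity.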
