Machine learning electronic structure and atomistic properties from the external potential

Jigyasa Nigam, Tess Smidt, Geneviève Dusson

TL;DR

This work proposes an operator-centered framework in which the external (nuclear) potential, expressed in an AO basis, serves as the model input, and builds hierarchical, body-ordered representations of atomic configurations that closely mirror the principles underlying several popular atom-centered descriptors.

Abstract

Electronic structure calculations remain a major bottleneck in atomistic simulations and, not surprisingly, have attracted significant attention in machine learning (ML). Most existing approaches learn a direct map from molecular geometries, typically represented as graphs or encoded local environments, to molecular properties or use ML as a surrogate for electronic structure theory by targeting quantities such as Fock or density matrices expressed in an atomic orbital (AO) basis. Inspired by the Hohenberg-Kohn theorem, in this work, we propose an operator-centered framework in which the external (nuclear) potential, expressed in an AO basis, serves as the model input. From this operator, we construct hierarchical, body-ordered representations of atomic configurations that closely mirror the principles underlying several popular atom-centered descriptors. At the same time, the matrix-valued nature of the external potential provides a natural connection to equivariant message-passing neural networks. In particular, we show that successive products of the external potential provide a scalable route to equivariant message passing and enable an efficient description of long-range effects. We demonstrate that this approach can be used to model molecular properties, such as energies and dipole moments, from the external potential, or learn effective operator-to-operator maps, including mappings to the Fock matrix and the reduced density matrix from which multiple molecular observables can be simultaneously derived.
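The role of matrix products described in the abstract can be illustrated with a minimal sketch. It is a toy illustration under stated assumptions, not the authors' implementation: each atom carries a single normalized $s$-type Gaussian with a shared, hypothetical exponent `alpha`, the geometry is an arbitrary water-like arrangement, and the helper names (`boys_f0`, `external_potential_s_only`, `pooled_power_features`) are invented here. The nuclear-attraction integral uses the standard closed form for $s$ Gaussians. Since $(\mathbf{V}^2)_{ij} = \sum_k \mathbf{V}_{ik}\mathbf{V}_{kj}$, each additional power combines information from one more intermediate atom, and summing the entries over all atom pairs gives structure-level features that are unchanged by rotations, translations, and atom relabeling.

```python
import math
import numpy as np

def boys_f0(t):
    """Zeroth-order Boys function F0(t) = 0.5*sqrt(pi/t)*erf(sqrt(t)); F0(0) = 1."""
    if t < 1e-12:
        return 1.0 - t / 3.0  # leading terms of the Taylor expansion
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))

def external_potential_s_only(positions, charges, alpha=0.5):
    """Nuclear-attraction matrix with one normalized s Gaussian per atom (toy basis)."""
    n = len(positions)
    p = 2.0 * alpha                          # exponent of the Gaussian product
    norm = (2.0 * alpha / math.pi) ** 1.5    # product of two normalization constants
    V = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            center = 0.5 * (positions[i] + positions[j])
            r2 = float(np.sum((positions[i] - positions[j]) ** 2))
            pref = norm * (2.0 * math.pi / p) * math.exp(-0.5 * alpha * r2)
            for R_c, Z_c in zip(positions, charges):
                t = p * float(np.sum((center - R_c) ** 2))
                V[i, j] -= Z_c * pref * boys_f0(t)
    return V

def pooled_power_features(V, kappa_max=3):
    """Sum the entries of V, V^2, ..., V^kappa_max over all atom pairs."""
    feats, power = [], np.eye(len(V))
    for _ in range(kappa_max):
        power = power @ V                    # one more "message-passing" step
        feats.append(power.sum())
    return np.array(feats)

# Hypothetical water-like geometry (arbitrary length units) and nuclear charges.
positions = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
charges = np.array([8.0, 1.0, 1.0])

V = external_potential_s_only(positions, charges)
features = pooled_power_features(V)
```

In an $s$-only basis every matrix element is a scalar, so this sketch only exercises the invariant channel; the equivariant behavior discussed in the abstract requires basis functions with higher angular momentum.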

Paper Structure (19 sections, 21 equations, 10 figures)

Figures (10)

  • Figure 1: Schematic overview of the different models considered in this work. For each molecular configuration $A$, we aim to predict target properties whose reference values are computed using a quantum mechanical (QM) calculation (top, red) in an AO basis set $\mathcal{B}_\mathbf{y}$. This calculation produces the underlying Fock and density matrices (collectively denoted as $\mathbf{M}$), as well as $\mathbf{V}$, shown in the top row. $\mathbf{V}$ can serve as a physically meaningful, naturally symmetry-adapted representation from which target properties may be predicted directly (Op2Prop, green). Unlike these property-specific models, Op2Op models (dark blue) map $\mathbf{V}$ to $\mathbf{M}$ in the target basis $\mathcal{B}_\mathbf{y}$, from which properties of interest may be derived through simple analytic operations. Rather than restricting $\mathbf{V}$ to the basis of the reference calculation, we treat its basis as a tunable hyperparameter and use as input $\mathbf{V}$ computed in a richer basis set $\mathcal{B}_\mathbf{V}$ (red dashed). Similarly, instead of supervising the AO matrices underlying the calculation directly, one can predict a compressed effective representation of the operator in a (usually smaller) basis set $\mathcal{B}_\mathbf{M}$, which learns an effective projection of the QM calculation onto a reduced basis (cyan).
  • Figure 2: Decomposition of AO matrix representations of atomistic properties and electronic operators into symmetry-adapted blocks. For Op2Op models, both the input external potential $\mathbf{V}$ and output matrices $\mathbf{M}$ are decomposed into irreducible representations (irreps) of $O(3)$, indexed by angular momentum $\lambda$ and parity $\sigma$. To ease visualization, we represent $\mathbf{V}$ and $\mathbf{M}$ on the same basis, although this is not necessary for the models presented in this work. For each species pair, the multiplicity of each irrep depends on the orbital pairs that can contribute to the symmetry block, as determined by angular momentum coupling, while the number of such elements in each block is set by the distinct atom pairs $ij$ corresponding to the species. For example, for the species pair (O,H), contributions arise from the atom pairs (O,$\text{H}_1$) and (O,$\text{H}_2$), while for the species pair (H,H), $ij$ corresponds to atom pairs ($\text{H}_1$,$\text{H}_1$), ($\text{H}_1$,$\text{H}_2$) and ($\text{H}_2$,$\text{H}_2$). A linear model (blue) maps each irrep block of the input to the corresponding irrep block of the output, with block-specific weights. For Op2Prop models (green), symmetry-adapted blocks of the external potential are summed over all atom pairs and species blocks are concatenated to produce structure-level features for property prediction.
  • Figure 3: Examples demonstrating the role of basis resolution and matrix products in distinguishing molecular structures. (a) Distorted octahedral geometries, showing a reference atom pair $i, j$ separated by distance $r$ and neighboring atoms placed on a circle in the equatorial plane. (b) Table summarizing, for each pair of configurations, whether the matrix elements $\mathbf{V}(A_{ij})$ and $\mathbf{V}(A'_{ij})$ are distinguishable for basis sets truncated at the listed orbitals. Randomly distorted octahedra become distinguishable when $p$ orbitals are included, while regular octahedra require $d$ orbitals due to their higher symmetry. (c) Pair of structures adapted from Pozdnyakov et al. (2020) with $\theta = \pi/4$ radians, which have identical three-body correlations and lead to degenerate matrix elements in an $s$-only basis. (d) Table showing that these structures become distinguishable when matrix products ($\kappa_{\max} = 2$) are included, as information from multiple atom pairs is combined through matrix multiplication.
  • Figure 4: Considering only the invariant features ($\lambda = 0, \sigma=1$), for (a) water molecules and (b) organic molecules from the QM7 dataset, we plot the error of reconstructing the atom-centered density correlations $\rho^{\otimes 2}$ from the input $\mathbf{V}^{\kappa_\text{max}}$ (blue) and vice versa (red). The reconstruction error is reported as a function of the matrix power; increasing the power makes the representation of the external potential increasingly nonlocal.
  • Figure 5: Interaction energy for water dimers (total energy baselined by the mean $\overline{E}$) as a function of the distance $d$ between the two monomers. The shaded region shows the distances included in the training set. Reference values (black dashed) are compared against predictions from the atom-centered descriptors $\rho^{\otimes 2}$ and from products of the external potential matrix up to maximum algebraic power $\kappa_\text{max}$ (blue).
  • ...and 5 more figures
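
The symmetry-adapted decomposition described in the Figure 2 caption can be sketched for the simplest non-trivial case. This is a minimal illustration under stated assumptions, not the paper's code: it takes a single $3 \times 3$ block between two $p$ shells written in a Cartesian $(x, y, z)$ ordering, so that a rotation $R$ acts on the block as $R\,\mathbf{B}\,R^{T}$, and splits it into $\lambda = 0$ (trace), $\lambda = 1$ (antisymmetric part, an axial vector), and $\lambda = 2$ (symmetric traceless part) components. The function names `decompose_pp_block`, `recompose_pp_block`, and `random_rotation` are hypothetical, and the block entries are random numbers rather than real integrals.

```python
import numpy as np

def decompose_pp_block(block):
    """Split a Cartesian p-p block into lambda = 0, 1, 2 components."""
    scalar = np.trace(block) / 3.0                                   # lambda = 0
    antisym = 0.5 * (block - block.T)
    axial = np.array([antisym[1, 2], antisym[2, 0], antisym[0, 1]])  # lambda = 1
    sym_traceless = 0.5 * (block + block.T) - scalar * np.eye(3)     # lambda = 2
    return scalar, axial, sym_traceless

def recompose_pp_block(scalar, axial, sym_traceless):
    """Inverse of decompose_pp_block: rebuild the 3x3 block from its components."""
    ax, ay, az = axial
    antisym = np.array([[0.0, az, -ay],
                        [-az, 0.0, ax],
                        [ay, -ax, 0.0]])
    return scalar * np.eye(3) + antisym + sym_traceless

def random_rotation(rng):
    """Random proper rotation matrix (det = +1) from a QR factorization."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

rng = np.random.default_rng(0)
block = rng.normal(size=(3, 3))      # stand-in for a p-p block of V or M
R = random_rotation(rng)

s, a, t = decompose_pp_block(block)
s_rot, a_rot, t_rot = decompose_pp_block(R @ block @ R.T)

# Each component transforms within its own irrep under the rotation.
scalar_is_invariant = np.isclose(s_rot, s)
axial_rotates_as_vector = np.allclose(a_rot, R @ a)
tensor_rotates_as_rank2 = np.allclose(t_rot, R @ t @ R.T)
```

Because the components do not mix under rotation, a linear map with separate weights per component, as in the block-wise linear model of Figure 2, preserves equivariance by construction. Under an improper rotation the $\lambda = 1$ part picks up an extra sign, which is the kind of behavior the parity label $\sigma$ in the caption keeps track of.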