Efficient, Equivariant Predictions of Distributed Charge Models

Eric D. Boittier, Markus Meuwly

TL;DR

The paper presents DCM-net, an $SO(3)$-equivariant graph neural network that predicts distributed charges per atom to model the molecular ESP with controlled anisotropy. By learning $n_{\rm DC}$ off-center charges, the method bridges atom-centered monopoles and full multipole expansions, achieving ESP accuracies approaching MBIS dipoles (for $n_{\rm DC}=2$) and quadrupoles (for $n_{\rm DC}=3$–$4$) while remaining transferable across conformations and chemical space. Training on QM9 and CO$_2$ conformers, with transfer learning to non-equilibrium dipeptides, demonstrates substantial ESP improvements over MBIS monopoles and competitive performance with higher-order multipoles, along with symmetry-consistent minimal charge models suitable for ML/MM force fields. Overall, DCM-net provides a fast, physically grounded route to anisotropic electrostatics, enabling scalable, transferable distributed charge representations and streamlined force-field development, with equivariance serving as a principled design choice to ensure correct 3D behavior.

Abstract

A machine learning (ML) based equivariant neural network for constructing distributed charge models (DCMs) of arbitrary resolution, DCM-net, is presented. DCMs efficiently and accurately model the anisotropy of the molecular electrostatic potential (ESP) and go beyond the point charge representation used in conventional molecular mechanics (MM) energy functions. This is particularly relevant for capturing the conformational dependence of the ESP (internal polarization) and chemically relevant features such as lone pairs or σ-holes. Across conformational space, the learned charge positions from DCM-net are stable and continuous. Across the QM9 chemical space, two-charge-per-atom models achieve accuracies comparable to fitted atomic dipoles for previously unseen molecules (0.75 (kcal/mol)/e). Three- and four-charge-per-atom models reach accuracies competitive with atomistic multipole expansions up to quadrupole level (0.55 (kcal/mol)/e). Pronounced improvements of the ESP are found around O and F atoms, both of which are known to feature strongly anisotropic fields, and for aromatic systems. Across the QM9 reference data set, molecular dipole moments improve by 0.1 D compared with fitted monopoles. Transfer learning on dipeptides yields a 0.2 (kcal/mol)/e ESP improvement for unseen samples and a two-fold MAE reduction for molecular dipole moments versus fitted monopoles. Overall, DCM-net offers a fast and physically meaningful approach to generating distributed charge models for running pure ML or mixed ML/MM based molecular simulations.
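To make the ESP error metric quoted above concrete, the following is a minimal NumPy sketch of how a set of off-center point charges could be turned into an ESP on grid points and compared with a reference. It is not the authors' implementation: the function names, array shapes, and the Coulomb conversion constant are assumptions made for illustration, with charges in units of e and distances in Å.

```python
import numpy as np

# Assumed conversion: e^2 / (4*pi*eps0) in (kcal/mol) * Angstrom / e^2 (approximate).
COULOMB_KCAL_MOL_ANGSTROM = 332.0637


def distributed_charge_positions(atom_positions, displacements):
    """Place charges at atom centers plus per-charge displacements.

    atom_positions: (n_atoms, 3) array in Angstrom.
    displacements:  (n_atoms, n_dc, 3) array in Angstrom (hypothetical layout).
    Returns a flat (n_atoms * n_dc, 3) array of charge positions.
    """
    return (atom_positions[:, None, :] + displacements).reshape(-1, 3)


def esp_from_point_charges(charges, charge_positions, grid_points):
    """Coulomb potential in (kcal/mol)/e at each grid point.

    charges: (n_charges,) in e; charge_positions: (n_charges, 3);
    grid_points: (n_grid, 3). Grid points must not coincide with charges.
    """
    diff = grid_points[:, None, :] - charge_positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # (n_grid, n_charges)
    return COULOMB_KCAL_MOL_ANGSTROM * (charges[None, :] / dist).sum(axis=1)


def rmse(predicted, reference):
    """Root-mean-square error between two ESP arrays of equal shape."""
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))
```

In this sketch, a two-charge-per-atom model of a molecule with `n_atoms` atoms would pass `2 * n_atoms` charges to `esp_from_point_charges`, and `rmse` against a reference ESP on the same grid would give a value in (kcal/mol)/e.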

Paper Structure (15 sections, 13 equations, 11 figures, 4 tables)

This paper contains 15 sections, 13 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Reproducing the ESP: a graphical illustration of equations \ref{eq:mono}, \ref{eq:dip}, and \ref{eq:quad} using a computer render of colored 'point lights' in glass. a) The standard point charge representations of the monopole, dipole, and quadrupole moments. b) Aligning the dipole perpendicularly to the quadrupole directions gives a distributed charge solution for a system with vanishing monopole moment. Distributed charges offer a parameter-efficient approach to fitting the ESP in volumes of interest (outside the glass, i.e. the van der Waals region).
  • Figure 2: The architecture of the DCM-net model. (A) The inputs to the network are atomic numbers and positions expanded into atom-atom distances. (B) During the message passing phase, the hidden representation is updated over $N_{\rm MP}$ iterations using the message passing operation (green). Dense layers (Eq. \ref{eq:dense}) and tensor products (Eq. \ref{eq:tensor}) are shown in yellow and red, respectively. (C) The final output is split between scalar features (monopoles) and vector-like features (charge displacements relative to atomic centers). (D) Predictions using $n_{\rm DC}$ distributed charges (red, blue) per atom (gray) can be combined and optimized to create 'minimal' models with acceptable accuracy and improved computational efficiency, and can be used in molecular dynamics simulations.
  • Figure 3: Conformational space: Dynamics of the DCM-net/PhysNet two-charge model of CO$_2$. (a-c) Components of displacements for six distributed charges $\mathbf{\delta}$, labeled on the right-hand side $DC_{1-6}$, coupled with (d) the internal angle $\theta$, shown along the $z$ axis which is aligned with bonds (e) $r_a$ and $r_b$. (f) Small displacements along the $x$ axis for the distributed charges of the central carbon atom ($DC_{1,2}$), and their magnitude $q$, couple the electrostatic potential to changes in bond length. For the symmetric case $(\theta =180^{\circ})$, displacements approach 0.0 Å, and charges $q$ approach their mean value.
  • Figure 4: Chemical space: QM9. (A) Distribution of RMSE$_{\mathrm{ESP}}$ values (difference between the model and reference ESP) for the entire hold-out (test) set, and contributions of individual elements for (B) multipoles and (C) DCM-net. The normalized probability density is shown on the $y$-axis. Atom-centered multipole expansions up to monopole (black), dipole (blue) and quadrupole (red) are shown along with the two-, three-, and four-charge-per-atom models shown in pale green, green and dark green, respectively. The medians of the distributions are shown with vertical lines. Note the change in height of the distributions for the DCM-net models due to fatter tails in the distribution.
  • Figure 5: Chemical space: The spatial extent of the improvement introduced by DCM-net in comparison to MBIS point charges (note that the color scale is qualitative, identifying regions of positive/negative errors and charge), for randomly selected, labeled test set examples: panel A (C$_3$H$_3$N$_4$OF) and panel B (C$_5$H$_7$N$_3$O), with (1, black) MBIS point charges and (2-4, gray) DCM-net predictions for $n_{\rm DC} = 2, 3, 4$, respectively. On the left, grid points with greater than 75% of the absolute error for the monopole ESP. On the right, molecular dipole vectors (scaled by $4\times$ for clarity) of (black) the ground truth, (red) the MBIS monopole and (yellow) DCM-net models. Errors typically decrease as the number of distributed charges per atom increases. Atoms are colored gray (carbon), white (hydrogen), red (oxygen), blue (nitrogen), and green (fluorine).
  • ...and 6 more figures
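
The molecular dipole vectors compared in Figure 5 follow directly from any point charge model as the charge-weighted sum of positions. The short sketch below illustrates this for a generic set of charges; the function name and the e·Å to Debye conversion factor are assumptions for illustration and are not taken from the paper's code.

```python
import numpy as np

# Assumed conversion: 1 e*Angstrom expressed in Debye (approximate).
E_ANGSTROM_TO_DEBYE = 4.803204


def molecular_dipole(charges, positions):
    """Dipole vector sum_i q_i * r_i, returned in Debye.

    charges: (n_charges,) in e; positions: (n_charges, 3) in Angstrom.
    The result is independent of the origin only if the charges sum to zero.
    """
    return E_ANGSTROM_TO_DEBYE * (charges[:, None] * positions).sum(axis=0)


# Toy example: two opposite charges displaced along z from a common center.
toy_charges = np.array([0.5, -0.5])
toy_positions = np.array([[0.0, 0.0, 0.1], [0.0, 0.0, -0.1]])
toy_dipole = molecular_dipole(toy_charges, toy_positions)
```

For a distributed charge model, `positions` would hold the off-center charge sites rather than the nuclear coordinates, which is what allows the model dipole to differ from that of atom-centered monopoles.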