Coupled Cluster con MōLe: Molecular Orbital Learning for Neural Wavefunctions

Luca Thiede, Abdulrahman Aldossary, Andreas Burger, Jorge Arturo Campos-Gonzalez-Angulo, Ning Wang, Alexander Zook, Melisa Alkan, Kouhei Nakaji, Taylor Lee Patti, Jérôme Florian Gonthier, Mohammad Ghazi Vakili, Alán Aspuru-Guzik

TL;DR

MōLe (Molecular Orbital Learning) is an equivariant machine learning model that predicts the excitation amplitudes, the core mathematical objects of coupled-cluster (CC) theory, directly from mean-field Hartree-Fock molecular orbitals. It lays the foundations for high-accuracy wavefunction-based ML architectures that accelerate molecular design and complement force-field approaches.

Abstract

Density functional theory (DFT) is the most widely used method for calculating molecular properties; however, its accuracy is often insufficient for quantitative predictions. Coupled-cluster (CC) theory is the most successful method for achieving accuracy beyond DFT and for predicting properties that closely align with experiment. It is known as the "gold standard" of quantum chemistry. Unfortunately, the high computational cost of CC limits its widespread applicability. In this work, we present the Molecular Orbital Learning (MōLe) architecture, an equivariant machine learning model that directly predicts CC's core mathematical objects, the excitation amplitudes, from the mean-field Hartree-Fock molecular orbitals as inputs. We test various aspects of our model and demonstrate its remarkable data efficiency and out-of-distribution generalization to larger molecules and off-equilibrium geometries, despite being trained only on small equilibrium geometries. Finally, we examine its ability to reduce the number of cycles required to converge CC calculations. MōLe can lay the foundations for high-accuracy wavefunction-based ML architectures to accelerate molecular design and complement force-field approaches.

Paper Structure

This paper contains 48 sections, 68 equations, 7 figures, 7 tables, 1 algorithm.

Figures (7)

  • Figure 1: Given a molecule, a Hartree-Fock calculation provides the molecular orbitals (MOs), represented by their coefficient matrix $\mathbf{C}$. The coefficient vectors are padded per atom so that every atom has the same number of basis coefficients, which allows them to be embedded in an equivariant neural network. The model then alternates message-passing layers, which mix information within each MO, with attention layers, which mix information between MOs. Finally, the embeddings are read out by "outer product-like" operations that output the $T_1$ and $T_2$ amplitudes.
  • Figure 2: Left: the key components of the MōLe architecture, namely a padding and embedding layer, $(T)$ transformer layers, and the $T_1$ and $T_2$ readout. Right: details of the attention mechanism and of the $T_1$ and $T_2$ readout.
  • Figure 3: The energy error of MōLe, MP2+MACE (i.e., $\Delta$-MP2), and MACE along two scans. The inset shows the potential energy surface, with the black line indicating the ground-truth energies. MōLe achieves lower error, particularly in the transition-state region, whereas MACE overestimates the activation energies.
  • Figure 4: The electron density error of MP2 and MōLe on the amino acid L-arginine. The error is plotted at the 95th percentile of the MP2 error.
  • Figure 5: Left: Scaling the transformer's depth monotonically decreases the prediction error up to four layers. Right: Scaling the transformer width also decreases the prediction error monotonically.
  • ...and 2 more figures
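
The per-atom padding and the "outer product-like" readout mentioned in the Figure 1 caption can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function names, the `ao_atom_index` input, and the array layout are hypothetical, and the toy readout uses plain scalar features rather than the equivariant embeddings and learned layers of the actual model. It only shows the shapes involved: padded coefficients of shape (n_mo, n_atoms, width), $T_1$ of shape (n_occ, n_vir), and $T_2$ of shape (n_occ, n_occ, n_vir, n_vir).

```python
import numpy as np


def pad_mo_coefficients(C, ao_atom_index, n_atoms):
    """Group MO coefficients by atom and zero-pad to a common width.

    C             : (n_ao, n_mo) coefficient matrix, one column per MO.
    ao_atom_index : length-n_ao sequence; entry k is the atom that
                    basis function k is centered on (hypothetical input).
    Returns (padded, mask) with shapes (n_mo, n_atoms, width) and
    (n_atoms, width); mask is True where a real coefficient is stored.
    """
    ao_atom_index = np.asarray(ao_atom_index)
    counts = np.bincount(ao_atom_index, minlength=n_atoms)
    width = int(counts.max())  # largest number of basis functions on any atom
    n_mo = C.shape[1]
    padded = np.zeros((n_mo, n_atoms, width), dtype=C.dtype)
    mask = np.zeros((n_atoms, width), dtype=bool)
    for atom in range(n_atoms):
        rows = np.flatnonzero(ao_atom_index == atom)
        padded[:, atom, : len(rows)] = C[rows, :].T
        mask[atom, : len(rows)] = True
    return padded, mask


def toy_outer_product_readout(h_occ, h_vir):
    """Toy readout (NOT the paper's): build amplitude-shaped tensors
    from per-orbital feature vectors via outer-product-style contractions.

    h_occ : (n_occ, d) features of occupied orbitals.
    h_vir : (n_vir, d) features of virtual orbitals.
    """
    t1 = np.einsum("ik,ak->ia", h_occ, h_vir)    # (n_occ, n_vir)
    t2 = np.einsum("ia,jb->ijab", t1, t1)        # (n_occ, n_occ, n_vir, n_vir)
    return t1, t2


# Usage: 4 basis functions on 2 atoms (3 on atom 0, 1 on atom 1), 3 MOs.
C = np.arange(12, dtype=float).reshape(4, 3)
padded, mask = pad_mo_coefficients(C, [0, 0, 0, 1], n_atoms=2)
t1, t2 = toy_outer_product_readout(np.ones((2, 4)), np.ones((3, 4)))
```

The mask lets downstream layers ignore the padded zeros. The toy $T_2$ satisfies the pair-exchange symmetry $t_{ij}^{ab} = t_{ji}^{ba}$ by construction; a real readout would need to enforce this symmetry in whatever form it takes.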