Accurate and scalable exchange-correlation with deep learning

Giulia Luise, Chin-Wei Huang, Thijs Vogels, Derk P. Kooi, Sebastian Ehlert, Stephanie Lanius, Klaas J. H. Giesbertz, Amir Karton, Deniz Gunceler, Megan Stanley, Wessel P. Bruinsma, Lin Huang, Xinran Wei, José Garrido Torres, Abylay Katbashev, Rodrigo Chavez Zavaleta, Bálint Máté, Sékou-Oumar Kaba, Roberto Sordillo, Yingrong Chen, David B. Williams-Young, Christopher M. Bishop, Jan Hermann, Rianne van den Berg, Paola Gori-Giorgi

TL;DR

This work introduces Skala, a deep-learning exchange-correlation functional that learns non-local density interactions from data while preserving the $O(N^3)$ scaling of semi-local DFT. By combining a data-rich training regime (including ~150k high-accuracy energies) with a scalable non-local architecture that uses coarse points and spherical harmonics, Skala achieves chemical accuracy on atomization energies and competitive performance across broad main-group chemistry benchmarks. It demonstrates robust predictions for densities and equilibrium geometries, and its cost remains close to traditional semi-local functionals, enabling practical first-principles simulations at scale. As training data continues to expand, Skala offers a pathway to systematic improvements in predictive DFT without sacrificing efficiency, with potential to catalyze in silico discovery across chemistry and materials science.

Abstract

Density Functional Theory (DFT) is the most widely used electronic structure method for predicting the properties of molecules and materials. Although DFT is, in principle, an exact reformulation of the Schrödinger equation, practical applications rely on approximations to the unknown exchange-correlation (XC) functional. Most existing XC functionals are constructed using a limited set of increasingly complex, hand-crafted features that improve accuracy at the expense of computational efficiency. Yet, no current approximation achieves the accuracy and generality for predictive modeling of laboratory experiments at chemical accuracy -- typically defined as errors below 1 kcal/mol. In this work, we present Skala, a modern deep learning-based XC functional that bypasses expensive hand-designed features by learning representations directly from data. Skala achieves chemical accuracy for atomization energies of small molecules while retaining the computational efficiency typical of semi-local DFT. This performance is enabled by training on an unprecedented volume of high-accuracy reference data generated using computationally intensive wavefunction-based methods. Notably, Skala systematically improves with additional training data covering diverse chemistry. By incorporating a modest amount of additional high-accuracy data tailored to chemistry beyond atomization energies, Skala achieves accuracy competitive with the best-performing hybrid functionals across general main group chemistry, at the cost of semi-local DFT. As the training dataset continues to expand, Skala is poised to further enhance the predictive power of first-principles simulations.

Paper Structure

This paper contains 65 sections, 1 theorem, 43 equations, 15 figures, 16 tables.

Key Result

Theorem 1

Let $\kappa$ be a function in $L^2(\mathbb{R}^{3} \times \mathbb{R}^3)$ capturing the 2-body interaction between 3D coordinates $r_1$ and $r_2$. We assume $\kappa$ is globally rotationally invariant, that is, for any $Q\in SO(3)$, we have $\kappa(Q r_1, Q r_2) = \kappa(r_1, r_2)$. Assume $\{\phi_c, Y_{\ell}^m\}$ is a basis set of $L^2(\mathbb{R}^3)$ and we consider 2 copies of it, indexed by $(c_1, \ell_1, m_1)$ and $(c_2, \ell_2, m_2)$ to
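The global rotational invariance assumed in the theorem can be checked numerically. The sketch below is only an illustration, not the paper's kernel: `kappa` is a hypothetical square-integrable function built from the invariants $|r_1|$, $|r_2|$ and $r_1 \cdot r_2$, and `random_rotation` and `max_invariance_error` are helper names introduced here. It draws random $Q \in SO(3)$ and compares $\kappa(Q r_1, Q r_2)$ with $\kappa(r_1, r_2)$.

```python
import numpy as np


def kappa(r1, r2):
    """Hypothetical 2-body kernel (an assumption, not the paper's kernel).

    It depends only on |r1|^2, |r2|^2 and r1 . r2, so it is invariant when
    the same rotation is applied to both arguments.
    """
    return np.exp(-np.dot(r1, r1) - np.dot(r2, r2)) * (1.0 + np.dot(r1, r2)) ** 2


def random_rotation(rng):
    """Draw a rotation matrix Q in SO(3) via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix the sign ambiguity of each column
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one column to get determinant +1
    return q


def max_invariance_error(n_trials=100, seed=0):
    """Largest |kappa(Q r1, Q r2) - kappa(r1, r2)| over random samples."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_trials):
        r1, r2 = rng.normal(size=3), rng.normal(size=3)
        q = random_rotation(rng)
        worst = max(worst, abs(kappa(q @ r1, q @ r2) - kappa(r1, r2)))
    return worst


if __name__ == "__main__":
    print("max deviation under rotation:", max_invariance_error())
```

The deviation should stay at the level of floating-point round-off. A kernel that depended on a fixed direction in space, such as one coordinate of $r_1$, would fail this check.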

Figures (15)

  • Figure 1: Skala is a scalable deep-learned exchange-correlation functional. (a) Jacob's ladder of density functional approximations [perdew_jacobs_2001] defines the rungs LDA, GGA and meta-GGA by expanding the set of semi-local features they extract from an electronic density matrix into a grid representation. The next rungs, hybrid and double hybrid, extract increasingly expensive wavefunction-based information directly from the density matrix. Skala departs from this ladder by extracting relatively cheap meta-GGA features and gaining expressivity instead by learning non-local interactions between grid points at a manageable and controllable cost. (b) High-level overview of the neural network architecture for the Skala functional. (c) The plot's horizontal axis shows the weighted total mean absolute deviation (WTMAD-2) on the GMTKN55 [goerigk_look_2017] test set for general main group thermochemistry, kinetics and non-covalent interactions. The vertical axis shows the mean absolute error on the diverse atomization energies test set W4-17 [karton_w417_2017]. Skala performs similarly to the best-performing hybrid functionals, and reaches near chemical accuracy (1 kcal/mol) on W4-17.
  • Figure 2: Mean absolute errors in kcal/mol on all GMTKN55 subsets. The datasets are grouped according to the categories reported in the original paper [goerigk_look_2017] and sorted by the mean absolute energy per dataset. The colors indicate the performance relative to $\omega$B97M-V, where blue means better and red means worse. The colorbar shows $10 \log_{10}(\text{error ratio})$, which has unit decibel.
  • Figure 3: Model insights. (a) Accuracy of Skala's non-local architecture compared with its local branch only, trained on all of the data in Extended Data Table \ref{tab:training_data}. (b) Data composition ablation from Extended Data Table \ref{tab:training_data}: results of training Skala on A (MSR-ACC/TAE only), on B (the public data NCIAtlas and W4-CC plus the Atomic datasets only), on A + B, and on A + B with all the other MSR-ACC data, C, added. In both ablations, for each setting we trained three models using different random seeds. SCF fine-tuning was limited to 1000 steps, and evaluation was performed on the smaller Diet GMTKN55 [gould_diet_2018]. (c) The kinetic correlation component $T_\text{c}[\rho_{\gamma}]$ of $E_{\text{xc}}$ as a function of the density scaling parameter $\gamma$. The first four panels show results for models trained with the same data compositions as in panel (b), while the fifth panel shows results of the final Skala functional, which was trained with more compute. Positive values indicate that the exact constraint that $T_\text{c}$ is positive is satisfied, while negative values indicate violations.
  • Figure 4: Geometry optimization errors of various functionals
  • Figure 6:
  • ...and 10 more figures
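
The colorbar quantity in the Figure 2 caption, $10 \log_{10}(\text{error ratio})$, is a one-line computation. The sketch below assumes the ratio is the functional's mean absolute error divided by the reference functional's; the caption does not state the direction, and the function name and the numbers in the usage lines are hypothetical, not results from the paper.

```python
import math


def error_ratio_db(mae, mae_reference):
    """Return 10 * log10(mae / mae_reference) in decibels.

    Assumption: the ratio is taken as (functional error) / (reference error),
    so negative values mean a lower error than the reference.
    """
    if mae <= 0 or mae_reference <= 0:
        raise ValueError("mean absolute errors must be positive")
    return 10.0 * math.log10(mae / mae_reference)


if __name__ == "__main__":
    # Hypothetical errors in kcal/mol, for illustration only.
    print(error_ratio_db(0.5, 1.0))  # negative: better than the reference
    print(error_ratio_db(2.0, 1.0))  # positive: worse than the reference
```

On this scale, equal errors map to 0 dB and a factor of ten in error maps to 10 dB, so improvements and regressions of the same factor appear symmetric around zero.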

Theorems & Definitions (2)

  • Theorem 1
  • proof