Enhancing the Scalability and Applicability of Kohn-Sham Hamiltonians for Molecular Systems

Yunyang Li, Zaishuo Xia, Lin Huang, Xinran Wei, Han Yang, Sam Harshe, Zun Wang, Chang Liu, Jia Zhang, Bin Shao, Mark B. Gerstein

TL;DR

This work tackles the scalability gap in Hamiltonian prediction for DFT by introducing Wavefunction Alignment Loss (WALoss), which aligns predicted and true eigenspaces, and WANet, a scalable SE(3)-aware architecture built on eSCN, a two-part Hamiltonian head, and a mixture of long- and short-range experts. The authors release PubChemQH, a large dataset of 40–100-atom molecules, to study scaling and the SAD phenomenon, showing that elementwise Hamiltonian MAE alone poorly predicts ground-state energies for large systems. WALoss, combined with WANet, substantially improves energy predictions (the abstract reports a 1347-fold reduction in total energy prediction error) and accelerates SCF convergence (an 18% speed-up), while enabling accurate predictions of additional properties such as dipole moments and electronic extents. The work targets large-scale quantum chemistry and materials science tasks, though it notes the high computational cost of data generation and suggests directions for further gains in efficiency and generalization.

Abstract

Density Functional Theory (DFT) is a pivotal method within quantum chemistry and materials science, with its core involving the construction and solution of the Kohn-Sham Hamiltonian. Despite its importance, the application of DFT is frequently limited by the substantial computational resources required to construct the Kohn-Sham Hamiltonian. In response to these limitations, current research has employed deep-learning models to efficiently predict molecular and solid Hamiltonians, with roto-translational symmetries encoded in their neural networks. However, the scalability of prior models may be problematic when applied to large molecules, resulting in non-physical predictions of ground-state properties. In this study, we generate a substantially larger training set (PubChemQH) than used previously and use it to create a scalable model for DFT calculations with physical accuracy. For our model, we introduce a loss function derived from physical principles, which we call Wavefunction Alignment Loss (WALoss). WALoss involves performing a basis change on the predicted Hamiltonian to align it with the observed one; thus, the resulting differences can serve as a surrogate for orbital energy differences, allowing models to make better predictions for molecular orbitals and total energies than previously possible. WALoss also substantially accelerates self-consistent-field (SCF) DFT calculations. Here, we show it achieves a reduction in total energy prediction error by a factor of 1347 and an SCF calculation speed-up of 18%. These substantial improvements set new benchmarks for achieving accurate and applicable predictions in larger molecular systems.
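
The basis change described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' WALoss implementation: the function names `generalized_eigh` and `alignment_loss_sketch`, the plain mean-squared penalty, and the random test matrices are all assumptions made for illustration. The idea shown is only that, in the eigenbasis of the ground-truth generalized eigenproblem $\mathbf{H}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\epsilon}$, the true Hamiltonian becomes diagonal, so the transformed prediction can be compared directly against the orbital energies.

```python
import numpy as np

def generalized_eigh(H, S):
    """Solve H C = S C diag(eps) for symmetric H and positive-definite S
    using symmetric (Loewdin) orthogonalization."""
    s_vals, s_vecs = np.linalg.eigh(S)
    S_inv_sqrt = (s_vecs / np.sqrt(s_vals)) @ s_vecs.T  # S^(-1/2)
    eps, U = np.linalg.eigh(S_inv_sqrt @ H @ S_inv_sqrt)
    C = S_inv_sqrt @ U  # coefficients satisfying C.T @ S @ C = I
    return eps, C

def alignment_loss_sketch(H_pred, H_true, S):
    """Hypothetical alignment-style loss (an assumption, not the paper's formula):
    express H_pred in the ground-truth eigenbasis and penalize its deviation
    from the diagonal matrix of ground-truth orbital energies."""
    eps_true, C_true = generalized_eigh(H_true, S)
    H_pred_aligned = C_true.T @ H_pred @ C_true
    diff = H_pred_aligned - np.diag(eps_true)
    return float(np.mean(diff ** 2))

# Toy data (assumed): a random symmetric "Hamiltonian" and a positive-definite "overlap".
rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n))
H_true = (A + A.T) / 2
B = rng.normal(size=(n, n))
S = B @ B.T + n * np.eye(n)
noise = rng.normal(scale=1e-2, size=(n, n))
H_pred = H_true + (noise + noise.T) / 2

loss_exact = alignment_loss_sketch(H_true, H_true, S)  # numerically zero
loss_noisy = alignment_loss_sketch(H_pred, H_true, S)  # strictly positive
print(loss_exact, loss_noisy)
```

The diagonal of the transformed prediction plays the role of approximate orbital energies, which is why the abstract describes the resulting differences as a surrogate for orbital energy differences.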

Paper Structure

This paper contains 41 sections, 7 theorems, 110 equations, 14 figures, 14 tables, 1 algorithm.

Key Result

Theorem 1

Let $\mathbf{H}$ and $\hat{\mathbf{H}}$ represent Hamiltonian matrices, and $\mathbf{S}$ the overlap matrix. Define the perturbation matrix as $\Delta \mathbf{H} := \hat{\mathbf{H}} - \mathbf{H}$. Let $\lambda_i(\mathbf{H}, \mathbf{S})$ and $\lambda_i(\hat{\mathbf{H}}, \mathbf{S})$ be the $i^{th}$ generalize

Figures (14)

  • Figure 1: Visualization of the SAD phenomenon. The y-axis (in symmetrical log scale [webber2012bi]) represents the system energy error derived from the perturbed Hamiltonian, while the x-axis shows the relative MAE, defined as $\frac{\mathrm{MAE}(\hat{\mathbf{H}}, \mathbf{H}^\star)}{\mathrm{Mean}(\mathbf{H}^\star)}$, where $\mathbf{H}^\star$ represents the ground-truth Hamiltonian and $\hat{\mathbf{H}}$ denotes the predicted or perturbed Hamiltonian. The relative MAE is induced by model learning errors or Gaussian perturbation. The Gaussian perturbation ensures that the perturbed matrix remains Hermitian. The initial guess line is derived from the MINAO initial guess [almlof1982principles, van2006starting]. Current state-of-the-art models, such as QHNet, achieve a relative MAE of up to $10^{-2}$ on PubChemQH molecules. For small molecules, a $10^{-2}$ relative MAE is sufficient for accurate system energy predictions (left panel), but this accuracy does not extend to large systems (right panel).
  • Figure 2: (a) We introduce PubChemQH, a new resource for Hamiltonian learning that facilitates the exploration of the scalability challenge known as SAD. (b) We present WANet, a modern architecture designed for accurate Hamiltonian prediction. WANet incorporates $\mathrm{SO}(2)$ convolution, a mixture of pair experts, and a many-body interaction layer. The mixture of pair experts constructs the non-diagonal block, while the many-body interaction layer constructs the diagonal block. (c) Our loss module, WALoss, performs a basis transformation of the predicted Hamiltonian using the ground-truth Hamiltonian and the overlap matrix. This enhancement aims to improve the applicability of the predicted Hamiltonian in real-world scenarios. Our final loss function combines MAE-MSE loss with WALoss.
  • Figure 3: Wall-clock comparison of WANet-augmented DFT with traditional SCF iterations.
  • Figure 4: Comparison of training and inference efficiency and resource usage between QHNet and WANet on the PubChemQH dataset.
  • Figure 5: Model performance in predicting HOMO and LUMO energies for elongated alkanes. The left panel shows the MAE for HOMO predictions, while the right panel shows the MAE for LUMO predictions. "D1" indicates that the atom count is within the range of the PubChemQH dataset, whereas "D2" indicates that the atom count exceeds this range. The models compared include WANet with WALoss (w/ WALoss), our model without WALoss (w/o WALoss), and the initial guess (init guess). Notably, our model with WALoss demonstrates superior performance in LUMO predictions and matches the best HOMO performance, particularly in the "D2" region.
  • ...and 9 more figures
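
The relative MAE and the Hermitian-preserving Gaussian perturbation mentioned in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper names are hypothetical, the matrices are real (so Hermitian means symmetric), and $\mathrm{Mean}(\mathbf{H}^\star)$ is taken to be the mean absolute element of $\mathbf{H}^\star$, which the caption does not specify.

```python
import numpy as np

def relative_mae(H_hat, H_star):
    """MAE(H_hat, H_star) / Mean(H_star).
    Assumption: Mean(H_star) is read as the mean of |H_star| elementwise."""
    mae = np.mean(np.abs(H_hat - H_star))
    scale = np.mean(np.abs(H_star))
    return float(mae / scale)

def symmetric_gaussian_perturbation(H_star, sigma, rng):
    """Add Gaussian noise and symmetrize it, so a real symmetric (Hermitian)
    input stays symmetric after perturbation."""
    noise = rng.normal(scale=sigma, size=H_star.shape)
    return H_star + (noise + noise.T) / 2

# Toy data (assumed): a random symmetric matrix standing in for H_star.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
H_star = (A + A.T) / 2
H_hat = symmetric_gaussian_perturbation(H_star, sigma=1e-2, rng=rng)

print(relative_mae(H_hat, H_star))
```

Sweeping `sigma` over several orders of magnitude and recomputing a system energy from each perturbed matrix would trace out a curve of the kind the caption describes; the energy evaluation itself is outside the scope of this sketch.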

Theorems & Definitions (14)

  • Theorem 1
  • proof
  • Corollary 1: Perturbation Sensitivity Scaling
  • proof
  • Claim 1
  • Theorem 2
  • proof
  • Theorem 3: Weyl's Perturbation Theorem
  • Theorem 4: Davis-Kahan $\sin \theta$ Theorem
  • Theorem 5: Generalized Bai-Yin's law
  • ...and 4 more
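
As a numerical illustration of Weyl's perturbation theorem from the list above, the sketch below checks the textbook form of the bound for the standard symmetric eigenproblem: for real symmetric $\mathbf{A}$ and $\mathbf{E}$ with eigenvalues sorted in ascending order, $|\lambda_i(\mathbf{A} + \mathbf{E}) - \lambda_i(\mathbf{A})| \le \|\mathbf{E}\|_2$. The paper's own statement is not reproduced on this page and may be phrased differently, and Theorem 1 concerns the generalized problem with an overlap matrix, which this check does not cover. The matrices and the helper `random_symmetric` are assumptions for illustration.

```python
import numpy as np

def random_symmetric(n, rng, scale=1.0):
    """Draw a random real symmetric matrix (hypothetical helper)."""
    M = rng.normal(scale=scale, size=(n, n))
    return (M + M.T) / 2

rng = np.random.default_rng(0)
n = 8
A = random_symmetric(n, rng)
E = random_symmetric(n, rng, scale=1e-2)  # small symmetric perturbation

# eigvalsh returns eigenvalues in ascending order, so entries pair up by index.
shifts = np.abs(np.linalg.eigvalsh(A + E) - np.linalg.eigvalsh(A))
bound = np.linalg.norm(E, 2)  # spectral norm of the perturbation

print(shifts.max(), bound)
```

The largest eigenvalue shift never exceeds the spectral norm of the perturbation, which is the sense in which elementwise errors in a predicted matrix bound errors in its spectrum.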