Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity

Erpai Luo, Xinran Wei, Lin Huang, Yunyang Li, Han Yang, Zaishuo Xia, Zun Wang, Chang Liu, Bin Shao, Jia Zhang

TL;DR

SPHNet introduces adaptive sparsity for SE(3) equivariant Hamiltonian prediction, reducing the computational burden of high-order tensor products while preserving accuracy. It uses two sparse gates (Sparse Pair Gate and Sparse Tensor Product Gate) guided by a Three-phase Sparsity Scheduler to achieve up to 70% sparsity and up to 7x speedups on large molecular systems. The approach demonstrates state-of-the-art performance on QH9 and PubChemQH, with notable memory savings and scalable behavior on large Hamiltonians, and shows promise for broad applicability to other SE(3) networks. The work provides a practical pathway to scalable quantum-property predictions and suggests future extensions to additional tasks and architectures, with code available at https://github.com/microsoft/SPHNet.

Abstract

Hamiltonian matrix prediction is pivotal in computational chemistry, serving as the foundation for determining a wide range of molecular properties. While SE(3) equivariant graph neural networks have achieved remarkable success in this domain, their substantial computational cost--driven by high-order tensor product (TP) operations--restricts their scalability to large molecular systems with extensive basis sets. To address this challenge, we introduce SPHNet, an efficient and scalable equivariant network that incorporates adaptive SParsity into Hamiltonian prediction. SPHNet employs two innovative sparse gates to selectively constrain non-critical interaction combinations, significantly reducing tensor product computations while maintaining accuracy. To optimize the sparse representation, we develop a Three-phase Sparsity Scheduler, ensuring stable convergence and achieving high performance at sparsity rates of up to 70%. Extensive evaluations on the QH9 and PubChemQH datasets demonstrate that SPHNet achieves state-of-the-art accuracy while providing up to a 7x speedup over existing models. Beyond Hamiltonian prediction, the proposed sparsification techniques also hold significant potential for improving the efficiency and scalability of other SE(3) equivariant networks, further broadening their applicability and impact. Our code can be found at https://github.com/microsoft/SPHNet.
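
To illustrate why high-order tensor products limit scalability, the growth rates stated in the Figure 1 caption (pair count quadratic in the number of atoms $N$, per-product time growing with the sixth power of the order $L$) can be turned into simple arithmetic. This is only a back-of-the-envelope sketch: treating those growth rates as exact proportionalities is an assumption, the function name `relative_tp_cost` is hypothetical, and the result is in arbitrary units rather than measured time.

```python
def relative_tp_cost(num_atoms, max_order):
    """Idealized tensor-product cost in arbitrary units.

    Assumption (not a measurement): cost is exactly proportional to
    N^2 (all atom pairs) times L^6 (per-product time).
    """
    return num_atoms ** 2 * max_order ** 6


# Same molecule, larger basis set: order 4 (def2-SVP) vs. order 6 (def2-TZVP).
order_ratio = relative_tp_cost(10, 6) / relative_tp_cost(10, 4)

# Same basis set, twice as many atoms.
size_ratio = relative_tp_cost(20, 4) / relative_tp_cost(10, 4)

print(order_ratio)  # (6/4)^6 = 11.390625
print(size_ratio)   # 2^2 = 4.0
```

Under these assumptions, raising the maximum order from 4 to 6 alone multiplies the estimated cost by roughly 11, which is the kind of growth the sparse gates are meant to curb.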

Paper Structure

This paper contains 44 sections, 35 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: (A) The number of tensor products grows quadratically with the number of atoms $N$, as the Hamiltonian includes features for all possible atomic pair combinations. (B) The time cost of tensor products grows with the sixth power of their order $L$, where the increase in order corresponds to the expansion in the number of orbital types in the DFT basis set. For example, the def2-SVP basis set requires a maximum order of 4, while def2-TZVP demands an order of 6.
  • Figure 2: (A) The overall architecture of SPHNet. Atomic numbers and atomic coordinates are first passed through the Vectorial Node Interaction Blocks to obtain atomic features $\mathbf{x}_i^{\ell}$. Subsequently, the Sparse Pair Gate selects the key pair set $(i, j)$ for the Spherical Node Interaction Blocks, where the irreps $\mathbf{x}_i^{\ell}$ are elevated from $\ell = 1$ to $L_{\text{max}}$ during the interaction process. Next, the Sparse Tensor Product Gate in the construction block identifies the key cross-order combinations $(\ell_1, \ell_2, \ell_3)$ for the diagonal blocks, yielding diagonal pair features $\mathbf{f}_{ii}$. For non-diagonal blocks, both the Pair Gate and Tensor Product Gate are applied to select the critical pairs and tensor product combinations, producing non-diagonal pair features $\mathbf{f}_{ij}$. Finally, these features are fed into the expansion block to construct the predicted Hamiltonian matrix. (B) Sparse Pair Gate: It takes pairwise features as input, computes weights for each pair $(i, j)$, and selects an optimal subset using the sparsity scheduler. (C) Sparse Tensor Product Gate: Similarly, it utilizes the sparsity scheduler to identify an optimal subset of cross-order combinations $(\ell_1, \ell_2, \ell_3)$ based on learnable weights. (D) Three-Phase Sparsity Scheduler: Designed for the sparse gates, it operates in three phases: random, adaptive, and fixed.
  • Figure 3: The effect of the sparsity rate on model performance for datasets with molecules of different scales. For the detailed computational cost scaling on the PubChemQH dataset, see Appendix \ref{sec:sp_rate_perform}.
  • Figure 4: (A) The distribution of distances between selected node pairs and the proportion of pairs selected at each pair distance in the pair gate. (B) The proportion of times each output order is selected in the tensor product gate of the first non-diagonal block. See Appendix \ref{sec:sp_rate_set} for the full set of selected-pair cases.
  • Figure 5: Comparison of training speed as the size of the Hamiltonian increases.
  • ...and 4 more figures
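
The Figure 2 caption describes sparse gates that weight candidates (atom pairs or cross-order combinations) and keep a subset chosen by a scheduler with random, adaptive, and fixed phases. The following is a minimal sketch of how such a scheduler could drive a gate. The captions name the phases but do not specify their behavior, so everything here is an assumption: the class `ThreePhaseScheduler`, its parameters, and the interpretation of each phase (random subset, then top-k by weight, then a frozen selection) are hypothetical and are not taken from the SPHNet code.

```python
import random


class ThreePhaseScheduler:
    """Hypothetical sketch of a random -> adaptive -> fixed selection schedule."""

    def __init__(self, sparsity, random_steps, adaptive_steps, seed=0):
        self.sparsity = sparsity              # fraction of candidates to drop
        self.random_steps = random_steps      # assumed length of the random phase
        self.adaptive_steps = adaptive_steps  # assumed length of the adaptive phase
        self.rng = random.Random(seed)
        self.frozen = None                    # last top-k selection, reused when fixed

    def phase(self, step):
        if step < self.random_steps:
            return "random"
        if step < self.random_steps + self.adaptive_steps:
            return "adaptive"
        return "fixed"

    def select(self, weights, step):
        """Return the sorted indices of the candidates kept at this step."""
        n = len(weights)
        keep = max(1, round(n * (1.0 - self.sparsity)))
        phase = self.phase(step)
        if phase == "random":
            # Assumed: ignore the weights and keep a random subset.
            return sorted(self.rng.sample(range(n), keep))
        if phase == "fixed" and self.frozen is not None:
            # Assumed: stop re-selecting and reuse the last adaptive choice.
            return self.frozen
        # Assumed: keep the candidates with the largest gate weights.
        top = sorted(range(n), key=lambda i: weights[i], reverse=True)[:keep]
        self.frozen = sorted(top)
        return self.frozen


scheduler = ThreePhaseScheduler(sparsity=0.7, random_steps=2, adaptive_steps=2)
weights = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.05, 0.4, 0.6, 0.5]

random_pick = scheduler.select(weights, step=0)      # 3 of 10 candidates, chosen at random
adaptive_pick = scheduler.select(weights, step=2)    # top-3 by weight: [1, 3, 5]
fixed_pick = scheduler.select([0.0] * 10, step=4)    # weights ignored: still [1, 3, 5]
```

In this sketch, the same scheduler object could serve either gate: the candidates would be atom pairs $(i, j)$ for a pair gate or combinations $(\ell_1, \ell_2, \ell_3)$ for a tensor product gate, with only the source of the weights differing.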