Learning local equivariant representations for quantum operators

Zhanghao Zhouyin, Zixi Gan, MingKang Liu, Shishir Kumar Pandey, Linfeng Zhang, Qiangqiang Gu

TL;DR

This work addresses the problem of learning representations of quantum operator matrices within DFT and introduces SLEM, a strictly localized equivariant message-passing model. By enforcing strict locality and using an SO(2) convolution, SLEM learns Hamiltonians, density matrices, and overlap matrices with high accuracy while substantially reducing computational cost, even for high-order orbital bases. The model achieves state-of-the-art results on 2D and 3D materials, shows strong data efficiency and transferability, and supports scalable parallelization for large systems. Together with a Slater-Koster (SK) based overlap parameterization and open-source tools, SLEM aims to extend the reach of data-driven quantum simulations in materials science.

Abstract

Predicting quantum operator matrices such as the Hamiltonian, overlap, and density matrices in the density functional theory (DFT) framework is crucial for materials science. Current methods often focus on individual operators and struggle with efficiency and scalability for large systems. Here we introduce SLEM (strictly localized equivariant message-passing), a deep learning model for predicting multiple quantum operators that achieves state-of-the-art accuracy while substantially improving computational efficiency. SLEM's key innovation is a strictly local design that builds equivariant representations of quantum tensors while preserving physical symmetries. This design captures complex many-body dependencies without expanding the effective receptive field, leading to superior data efficiency and transferability. Using an SO(2) convolution and an invariant overlap parameterization, SLEM reduces the computational complexity of high-order tensor products and can therefore handle systems that require $f$ and $g$ orbitals in their basis sets. We demonstrate SLEM's capabilities across diverse 2D and 3D materials, achieving high accuracy even with limited training data. SLEM's design facilitates efficient parallelization, potentially extending DFT simulations to device-scale systems and opening new possibilities for large-scale quantum simulations and high-throughput materials discovery.
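
To make the "effective receptive field" statement concrete, here is a minimal sketch on a 1D chain of equally spaced atoms. It is not the SLEM implementation. The function names, the unit spacing, and the cutoff value are assumptions chosen for illustration. The sketch assumes that a conventional message-passing layer lets each atom absorb the current features of every atom within $r_{\text{cut}}$, so the reach grows with the number of layers, while a strictly local scheme only ever uses atoms within $r_{\text{cut}}$ of the central atom.

```python
def neighbors(positions, r_cut):
    """Map each atom index to the set of atoms within r_cut (itself included)."""
    return {
        i: {j for j, xj in enumerate(positions) if abs(xi - xj) <= r_cut}
        for i, xi in enumerate(positions)
    }


def mpnn_receptive_field(nbrs, center, n_layers):
    """Atoms that can influence `center` after n_layers of ordinary message passing.

    Each layer pulls in the neighbors of every atom already in the field,
    so the field expands by one cutoff radius per layer.
    """
    field = {center}
    for _ in range(n_layers):
        field = set().union(*(nbrs[j] for j in field))
    return field


def strictly_local_receptive_field(nbrs, center, n_layers):
    """Assumed strictly local scheme: the field never leaves the cutoff sphere."""
    return set(nbrs[center])


def radius(positions, field, center):
    """Largest distance from `center` to any atom in the field."""
    return max(abs(positions[j] - positions[center]) for j in field)


# Hypothetical 1D chain: 21 atoms with unit spacing, cutoff of 2 length units.
positions = [float(i) for i in range(21)]
r_cut = 2.0
center = 10
nbrs = neighbors(positions, r_cut)

for n_layers in (1, 2, 3):
    r_mpnn = radius(positions, mpnn_receptive_field(nbrs, center, n_layers), center)
    r_local = radius(
        positions, strictly_local_receptive_field(nbrs, center, n_layers), center
    )
    print(f"layers={n_layers}  MPNN r_eff={r_mpnn}  strictly local r_eff={r_local}")
```

In this toy setting the message-passing reach scales as the layer count times $r_{\text{cut}}$, while the strictly local reach stays at $r_{\text{cut}}$ regardless of depth.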

Paper Structure (26 sections, 27 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Local design of SLEM vs. MPNN on a 1D graph. (a) MPNN aggregation. (b) SLEM aggregation. Balls: nodes; sticks: edges; arrows: aggregation direction. $r_{\text{cut}}$: predefined cutoff; $r_{\text{eff}}$: effective cutoff after two layer updates. $L$: layer index.
  • Figure 2: Design of the SLEM model. (a) Hierarchical structure of the model. Starting from the atomic number $Z_i$ and the radial and spherical parts of the shift vector, $\bm{r}_{ij}$ and $\mathbf{Y}^{ij}_{c,l}$, the model generates the initialized hidden features $\mathbf{x}^{ij}$, $\mathbf{V}^{ij}$ along with the edge and node features $\mathbf{e}^{ij}$, $\mathbf{n}^{i}$. The two-body hidden features predict the SK parameters used to construct the (off-)diagonal blocks of the overlap operator. The other features are then iteratively updated using the strictly localized updating scheme. (b)-(d) show the hidden update (b), edge update (c), and node update (d). Node and edge features are used to construct the diagonal blocks for quantum operators.
  • Figure 3: Comparison of band structures for a Si MD trajectory snapshot: SLEM prediction vs. DFT calculation. Predicted band structures are obtained from either diagonalization of the predicted Hamiltonian or NSCF DFT calculation using predicted charge density, yielding indistinguishable results. Inset: Visualization of charge density distribution for the same structure.
  • Figure 4: Comparison of time and memory consumption for different tensor-product implementations. (a) Time consumption vs. angular momentum ($l$) for different models: the SO(2)-based SLEM model (triangles), with and without the radial part ($r$), DeepH-E3 (crosses) [gong2023general], and E3NN (squares) [e3nn_paper]. Inset: Log-scale fit with slopes of 1.2 for the SLEM model and 3.7 for the other two models. (b) Memory consumption vs. $l$. The SLEM model demonstrates over two orders of magnitude improvement in both time and memory efficiency compared to DeepH-E3 and E3NN.
  • Figure 5: Comparison of training time per iteration and memory consumption for the SLEM and DeepH-E3 [gong2023general] models.
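
The inset of Figure 4 reports slopes from a log-scale fit of time against $l$. A slope $p$ on such a plot corresponds to a power law, time $\propto l^{p}$. The sketch below shows one way such a slope could be estimated with an ordinary least-squares line through the log-transformed points. The helper name `loglog_slope`, the prefactors, and the timing values are hypothetical. The data are synthetic, generated from exact power laws with the caption's exponents, and are not measurements from the paper.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x); all values must be positive."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var


# Synthetic timings (illustrative only): exact power laws in l with the
# exponents quoted in the Figure 4 caption and arbitrary prefactors.
ls = [1, 2, 3, 4, 5, 6]
t_so2 = [0.5 * l**1.2 for l in ls]
t_full = [0.5 * l**3.7 for l in ls]

print("fitted slope, SO(2)-style scaling:", round(loglog_slope(ls, t_so2), 3))
print("fitted slope, full tensor-product scaling:", round(loglog_slope(ls, t_full), 3))
```

Because the synthetic inputs are exact power laws, the fit returns the generating exponents up to floating-point error. Real timings would scatter around the fitted line.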