Physics-Informed Long-Range Coulomb Correction for Machine-learning Hamiltonians

Yang Zhong, Xiwen Li, Xingao Gong, Hongjun Xiang

Abstract

Machine-learning electronic Hamiltonians achieve orders-of-magnitude speedups over density-functional theory, yet current models omit the long-range Coulomb interactions that govern the physics of polar crystals and heterostructures. We derive closed-form long-range Hamiltonian matrix elements in a nonorthogonal atomic-orbital basis through a variational decomposition of the electrostatic energy, and obtain a variationally consistent mapping from the electron density matrix to effective atomic charges. We implement this framework in HamGNN-LR, a dual-channel architecture combining E(3)-equivariant message passing with reciprocal-space Ewald summation. Our results demonstrate that physics-based long-range corrections are essential: purely data-driven attention mechanisms fail to capture macroscopic electrostatic potentials. Benchmarks on polar ZnO slabs, CdSe/ZnS heterostructures, and GaN/AlN superlattices show two- to threefold error reductions and robust transferability to systems far larger than those used in training, eliminating the characteristic staircase artifacts that plague short-range models in the presence of built-in electric fields.

Paper Structure (29 sections, 39 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Architecture of HamGNN-LR for predicting $H_{\text{total}} = H_{\text{sr}} + H_{\text{lr}}$. Short-range E(3)-equivariant message passing constructs $H_{\text{sr}}$ from local environments. An Ewald attention module captures long-range correlations in reciprocal space at $\mathcal{O}(N|\mathcal{K}|)$ cost, which is linear in system size $N$ for a fixed number of sampled wave vectors $|\mathcal{K}|$, and decodes ionic charges $Q_i$ and weight matrices $W_{i,\mu\nu}$. The decoded charges are neutralized ($Q_j \mapsto \tilde{Q}_j$) and, together with the weight matrices $W_{i,\mu\nu}$, substituted into the analytic correction $H_{\text{lr}}$ [Eq. (\ref{eq:longrange-ham})].
  • Figure 2: Hamiltonian MAE of single-layer SR (red circles) and LR (blue squares) models versus layer thickness for (a) GaN/AlN superlattices and (b) polar ZnO slabs. In (a), SR error grows with superlattice period and saturates at $\sim$140 Å; in (b), SR error rises with slab thickness before leveling off. The LR model [Eq. (\ref{eq:longrange-ham})] maintains uniformly low errors across all thicknesses.
  • Figure 3: Real-space onsite energy and band structure of the 56-layer polar ZnO slab. (a) Structural schematic; red arrow denotes spontaneous polarization $\mathbf{P}$. (b)--(e) Spatially resolved average onsite Hamiltonian matrix elements along $z$ for (b) DeepH-E3, (c) DeePTB-E3, (d) short-range HamGNN, and (e) HamGNN-LR. Open symbols: DFT reference; solid lines: model predictions. Shaded regions in (b)--(d) highlight staircase discontinuities arising from finite receptive fields; these artifacts are absent in (e) owing to the long-range correction. (f)--(h) Band structures along $\Gamma$--$K$--$M$--$\Gamma$ for (f) short-range HamGNN, (g) HamGNN-LR, and (h) DFT reference. Dashed red lines mark the Fermi level. HamGNN-LR closely reproduces DFT bands, whereas the short-range model exhibits visible deviations near the band gap.
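The Figure 1 caption mentions two steps that a short example can make concrete: neutralizing the decoded charges and evaluating a reciprocal-space Ewald sum at $\mathcal{O}(N|\mathcal{K}|)$ cost. The NumPy sketch below illustrates only the textbook reciprocal-space Ewald potential at atomic sites (Gaussian units); it is not the HamGNN-LR implementation and does not reproduce $H_{\text{lr}}$. The function names, the uniform-shift neutralization, the Gaussian width `alpha`, and the cutoff `n_max` are all assumptions made for illustration.

```python
import numpy as np


def neutralize(charges):
    """Shift all charges uniformly so they sum to zero.

    Assumption: a uniform shift is one simple neutralization scheme;
    the paper's actual mapping Q -> Q~ may differ.
    """
    charges = np.asarray(charges, dtype=float)
    return charges - charges.mean()


def reciprocal_vectors(cell, n_max):
    """Return all nonzero k = m1*b1 + m2*b2 + m3*b3 with |m_i| <= n_max.

    `cell` holds the lattice vectors as rows; rows of `recip` are the
    reciprocal lattice vectors satisfying a_i . b_j = 2*pi*delta_ij.
    """
    recip = 2.0 * np.pi * np.linalg.inv(cell).T
    rng = np.arange(-n_max, n_max + 1)
    m = np.array(np.meshgrid(rng, rng, rng, indexing="ij")).reshape(3, -1).T
    m = m[np.any(m != 0, axis=1)]  # drop k = 0 (requires a neutral cell)
    return m @ recip


def ewald_reciprocal_potential(positions, charges, cell, alpha=0.3, n_max=4):
    """Reciprocal-space Ewald potential at each atomic site.

    phi_i = (4*pi/V) * sum_{k != 0} exp(-k^2 / (4*alpha^2)) / k^2
            * Re[ exp(-i k.r_i) * S(k) ],   S(k) = sum_j Q_j exp(i k.r_j)

    Building the (N, K) phase table dominates the cost, so the work
    scales as O(N * K) for N atoms and K sampled wave vectors.
    """
    positions = np.asarray(positions, dtype=float)
    charges = np.asarray(charges, dtype=float)
    k = reciprocal_vectors(cell, n_max)                 # (K, 3)
    k2 = np.einsum("kd,kd->k", k, k)                    # (K,)
    volume = abs(np.linalg.det(cell))
    weights = 4.0 * np.pi / volume * np.exp(-k2 / (4.0 * alpha**2)) / k2
    phase = np.exp(1j * positions @ k.T)                # (N, K)
    structure = charges @ phase                         # (K,) structure factor
    return np.real(np.conj(phase) @ (weights * structure))  # (N,)
```

A call such as `ewald_reciprocal_potential(pos, neutralize(q), cell)` returns one potential value per atom. Because the structure factor is accumulated once and then reused for every site, no explicit pairwise loop over atoms is needed, which is what keeps the cost linear in $N$ for a fixed set of wave vectors.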