Neural network backflow for ab-initio solid calculations

An-Jun Liu, Bryan K. Clark

Abstract

Accurately simulating extended periodic systems is a central challenge in condensed matter physics. Neural quantum states (NQS) offer expressive wavefunctions for this task but face issues with scalability. In this work, we successfully extend the neural network backflow (NNBF) approach to ab-initio solid-state materials. Building on our scalable optimization framework for molecules [Liu et al., PRB 112, 155162 (2025)], we introduce a two-stage pruning strategy to manage the massive configuration space expansions: by utilizing a computationally cheap, physics-informed importance proxy, we devote exact NNBF amplitude evaluations solely to the most relevant determinants, significantly improving optimization efficiency, energy estimation, and convergence. Our framework achieves state-of-the-art accuracy across diverse solid-state benchmarks. For 1D hydrogen chains, NNBF matches or surpasses DMRG and AFQMC, remains robust in strongly correlated bond-breaking regimes where coupled-cluster methods fail, and smoothly extrapolates to the thermodynamic limit. We further demonstrate its scalability by computing ground-state potential energy curves for 2D graphene and 3D silicon. Finally, ablation studies validate the computational savings of our pruning strategy and highlight the dependence of the NNBF energies on basis sets.

Paper Structure (14 sections, 6 equations, 11 figures, 1 table)

Figures (11)

  • Figure 1: Schematic representation of the two-stage pruning algorithm. Circle 1: The core space $\mathcal{V}$ consists of the $\abs{\mathcal{V}}$ configurations with the largest amplitude moduli from the previous target space $\mathcal{U}$. Circle 2: A connected space $\mathcal{C}$ is expanded from $\mathcal{V}$ via non-zero Hamiltonian matrix elements. Each connected configuration $\ket{\mathbf{x}_j}\in\mathcal{C}$ may link to multiple core configurations $\ket{\mathbf{x}_i}\in\mathcal{V}$, and these connection strengths are quantified as $\abs{H_{ij}\psi_\theta(\mathbf{x}_i)}$. Circle 3: The importance score of each connected configuration $\ket{\mathbf{x}_j}$ is then defined as its maximum connection strength: $I(\mathbf{x}_j) = \max_{\ket{\mathbf{x}_i} \in \mathcal{V}} \left| \psi_\theta(\mathbf{x}_i) H_{ij} \right|$. Circle 4: The first stage of pruning selects the $\abs{\mathcal{V}}N_{conn}/r$ unique connected configurations with the largest importance scores to form an intermediate pool space $\mathcal{P}$. The predefined reduction factor $r$ ensures that the size of $\mathcal{P}$ matches the number of configurations that would be generated by fully expanding a reduced core space of size $\abs{\mathcal{V}}/r$. Circle 5: The second stage of pruning calculates the exact NNBF amplitudes for the combined space $\mathcal{P}\cup\mathcal{MC}$, where $\mathcal{MC}$ represents configurations proposed by persistent MCMC walkers running concurrently to provide stochastic exploration. Finally, the $\abs{\mathcal{V}}N_{conn}/(rl)$ elements with the largest amplitude moduli are selected to form the new target space $\mathcal{U}$. This space $\mathcal{U}$ is fixed and used for the subsequent $l$ steps, where $l$ is a predefined speedup factor.
  • Figure 2: Potential energy curve for H$_{10}$ under OBC using the STO-6G basis set. The energy errors of our NNBF ansatz, conventional quantum chemistry methods [CCSD, CCSD(T)], AFQMC, and DMRG are plotted relative to the exact FCI reference energy. The reported NNBF energies are obtained via the following protocol: three independent training runs using split localized Pipek--Mezey (PM) molecular orbitals are performed at each atomic separation (see Appendix \ref{appx:experimental_setup} for detailed settings), and a post-training MCMC inference is used to evaluate the variational energy of each run. The model yielding the lowest of these energies is selected. A final, independent MCMC inference is then conducted on this optimal model to obtain the unbiased energy estimate displayed in the figure. Data for all other benchmark methods are taken from Ref. Motta2017.
  • Figure 3: Potential energy curve at the thermodynamic limit (TDL) for the linear hydrogen chain under OBC using the STO-6G basis set. The TDL-extrapolated energies of our NNBF ansatz, conventional quantum chemistry methods [CCSD, CCSD(T)], and AFQMC are plotted relative to the TDL-extrapolated DMRG reference. Details of the NNBF TDL extrapolation are provided in Appendix \ref{appx:OBC_TDL_extrapolation}. Data for all other benchmark methods are taken from Ref. Motta2017.
  • Figure 4: Potential energy curve for H$_{10}$ under PBC using the STO-6G basis set. A unit cell of two hydrogen atoms is used, and the H$_{10}$ chain is modeled as a $5\times1\times1$ supercell. The energy errors of our NNBF ansatz and conventional quantum chemistry methods [HF, MP2, CISD, CCSD, CCSD(T)] are plotted relative to the exact FCI reference energy. The reported NNBF energies are obtained following the same training and evaluation protocol detailed in Fig. \ref{fig:H10_OBC_curve}. Data for all other benchmark methods are computed using PySCF [pyscf].
  • Figure 5: Ground-state energy convergence toward the thermodynamic limit (TDL) for the periodic hydrogen chain at an atomic separation of $r = 1.8$ Bohr in the STO-6G basis set. Calculations are performed using our NNBF ansatz alongside conventional quantum chemistry methods [CISD, CCSD, CCSD(T)]. The system is modeled using a unit cell of two hydrogen atoms. The upper and lower branches for each method correspond to supercells containing an even and odd number of unit cells, respectively. The extrapolation to the macroscopic limit ($N \to \infty$) is carried out by independently fitting the even and odd sequences to the finite-size scaling formula $E(N) = E_0 + B N^{-2}$. The reported NNBF energies are obtained following the same training and evaluation protocol detailed in Fig. \ref{fig:H10_OBC_curve}. Data for all other benchmark methods are computed using PySCF [pyscf].
  • ...and 6 more figures
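
The two-stage pruning described in the Figure 1 caption can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function name `two_stage_prune`, the callables `connections` and `amplitude`, and the toy configurations are all hypothetical, and the pool and target sizes (which the caption gives as $\abs{\mathcal{V}}N_{conn}/r$ and $\abs{\mathcal{V}}N_{conn}/(rl)$) are passed in directly as integers. Whether the connected space includes the core configurations themselves is left to whatever `connections` returns.

```python
import heapq
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Config = Hashable  # hypothetical stand-in for an occupation-number configuration


def two_stage_prune(
    core_amps: Dict[Config, complex],
    connections: Callable[[Config], Iterable[Tuple[Config, complex]]],
    amplitude: Callable[[Config], complex],
    mc_configs: Iterable[Config],
    pool_size: int,
    target_size: int,
) -> List[Config]:
    """Sketch of the two-stage selection; all names here are assumptions."""
    # Stage 1: cheap importance proxy I(x_j) = max_i |psi(x_i) * H_ij|,
    # computed without evaluating the network on x_j.
    importance: Dict[Config, float] = {}
    for x_i, psi_i in core_amps.items():
        for x_j, h_ij in connections(x_i):
            if h_ij == 0:
                continue  # only non-zero matrix elements connect configurations
            score = abs(psi_i * h_ij)
            if score > importance.get(x_j, 0.0):
                importance[x_j] = score
    pool = heapq.nlargest(pool_size, importance, key=importance.get)

    # Stage 2: exact amplitudes only on the pool plus the MCMC proposals.
    candidates = set(pool) | set(mc_configs)
    exact = {x: amplitude(x) for x in candidates}
    return heapq.nlargest(target_size, exact, key=lambda x: abs(exact[x]))


# Toy data (made up for illustration, not taken from the paper).
core = {"a": 0.9, "b": 0.1}
h_rows = {
    "a": [("c", 0.5), ("d", 0.1)],
    "b": [("c", 2.0), ("e", 1.0), ("f", 0.0)],
}
exact_amps = {"c": 0.3, "d": -0.5, "e": 0.05}

target = two_stage_prune(
    core_amps=core,
    connections=lambda x: h_rows[x],
    amplitude=lambda x: exact_amps[x],
    mc_configs=["d"],
    pool_size=2,
    target_size=2,
)
print(target)
```

In this toy run, configuration "d" has a low importance score and is dropped in the first stage, yet it re-enters through the MCMC proposals and ends up in the target space because its exact amplitude is large. That is the role the caption assigns to the persistent walkers: stochastic exploration beyond what the proxy selects.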
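
The finite-size extrapolation in the Figure 5 caption, $E(N) = E_0 + B N^{-2}$, is linear in $x = N^{-2}$, so an ordinary least-squares line fit recovers $E_0$ as the intercept. The sketch below assumes that reading; the function name `fit_inverse_square` is hypothetical, and the energies are synthetic numbers generated from the formula itself, not results from the paper. Following the caption, the even and odd sequences are fitted independently.

```python
from typing import Sequence, Tuple


def fit_inverse_square(
    sizes: Sequence[int], energies: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of E(N) = E0 + B * N**-2; returns (E0, B)."""
    xs = [n ** -2 for n in sizes]  # the model is a straight line in x = N^-2
    x_mean = sum(xs) / len(xs)
    e_mean = sum(energies) / len(energies)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxe = sum((x - x_mean) * (e - e_mean) for x, e in zip(xs, energies))
    b = sxe / sxx
    e0 = e_mean - b * x_mean  # intercept = extrapolated N -> infinity energy
    return e0, b


# Synthetic data (assumed values, for illustration only).
even_sizes = [2, 4, 6, 8]
odd_sizes = [3, 5, 7, 9]
even_energies = [-0.5 + 0.3 * n ** -2 for n in even_sizes]
odd_energies = [-0.5 - 0.2 * n ** -2 for n in odd_sizes]

e0_even, b_even = fit_inverse_square(even_sizes, even_energies)
e0_odd, b_odd = fit_inverse_square(odd_sizes, odd_energies)
print(e0_even, b_even)
print(e0_odd, b_odd)
```

With real data the two branches approach the limit from opposite sides, so agreement between the even and odd intercepts gives a simple consistency check on the extrapolated energy.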