DysonNet: Constant-Time Local Updates for Neural Quantum States

Lucas Winter, Andreas Nunnenkamp

Abstract

Neural quantum states (NQS) provide a flexible variational framework for many-body wavefunctions, but suffer from high computational cost and limited interpretability. We introduce DysonNet, a broad class of NQS that couples strictly local nonlinearities through global linear layers. This structure is analogous to a truncated Dyson series which gives an intuitive interpretation of local wavefunction updates as scattering from static impurities. By resumming the scattering series, single-spin-flip updates can be computed in $\mathcal{O}(1)$ time, independent of system size, using an algorithm we call ABACUS. Implementing DysonNet with the state-space model S4, we obtain up to $230\times$ speedups over Vision-Transformers for computing the local estimator. This corresponds to an asymptotic $\mathcal{O}(N^2)$ improvement in training-time scaling, reaching $\mathcal{O}(N \log^2 N)$ total training complexity in area-law phases. Benchmarks on the 1D long-range Ising model and frustrated $J_1$-$J_2$ chains show that DysonNet matches state-of-the-art NQS accuracy while removing the dominant local-update overhead. More broadly, our results suggest a route to scalable NQS architectures where physical interpretability directly enables computational efficiency.
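
The scattering picture invoked above can be illustrated with the textbook single-impurity problem. The following is a minimal sketch, not the paper's algorithm: it assumes a hypothetical tight-binding chain `H0`, an impurity of strength `v` at site `j`, and a complex energy `z`, and compares a truncated Dyson series with its closed-form resummation (the T-matrix).

```python
import numpy as np

# Hypothetical toy parameters (not taken from the paper).
N, j, v = 40, 17, 0.8
z = 0.3 + 0.5j                                  # off the real axis, so inverses exist

H0 = -(np.eye(N, k=1) + np.eye(N, k=-1))        # nearest-neighbour hopping chain
G0 = np.linalg.inv(z * np.eye(N) - H0)          # unperturbed Green's function

V = np.zeros((N, N))
V[j, j] = v                                     # static impurity on a single site
G_exact = np.linalg.inv(z * np.eye(N) - H0 - V)

def dyson_truncated(order):
    """Partial sum G0 + G0 V G0 + ... up to the given number of scatterings."""
    G, term = G0.copy(), G0.copy()
    for _ in range(order):
        term = G0 @ V @ term
        G = G + term
    return G

# Resummed series: the geometric sum over repeated scatterings at site j
# collapses to a scalar T-matrix t = v / (1 - v * G0[j, j]).
t = v / (1.0 - v * G0[j, j])
G_resummed = G0 + t * np.outer(G0[:, j], G0[j, :])

for order in (1, 2, 4, 8):
    err = np.abs(dyson_truncated(order) - G_exact).max()
    print(f"order {order}: max error of truncated series = {err:.2e}")
print("max error of resummed form =", np.abs(G_resummed - G_exact).max())
```

Once `G0` is known, any single entry of the perturbed Green's function follows from three entries of `G0` and the scalar `t`, so the cost per entry does not grow with the chain length. This is the sense in which resumming a local perturbation can make an update independent of system size.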

Paper Structure

This paper contains 33 sections, 2 theorems, 122 equations, 8 figures, 5 tables, 2 algorithms.

Key Result

Theorem 1

Given a linear token mixer $M^{(l)}$ and a local non-linearity $D^{(l)}(\boldsymbol \sigma)$, the recurrence in Eqs. \ref{eq:recur_step}–\ref{eq:recur_out} evaluates $\boldsymbol \Omega(\boldsymbol \sigma)$ exactly for any local update $\boldsymbol \sigma=\boldsymbol \sigma_0+\Delta\boldsymbol \sigma$. Given precompute
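
To make the idea of an exact slice-restricted update concrete, here is a toy sketch under strong assumptions. It is not the paper's recurrence and does not reproduce its precomputation scheme. The model is hypothetical: dense mixers `G[l]`, a gate that depends only on the spin at the same site (slice width $w=1$), a fixed input `x`, and mean pooling. The names `link` and `readout` are illustrative stand-ins for slice-to-slice and slice-to-output propagators.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, j = 32, 3, 10                      # toy sizes: sites, layers, flipped site
S = np.array([j])                        # slice touched by the flip (width w = 1)

# Hypothetical toy model: dense linear mixers and a strictly on-site gate,
# so flipping spin j changes the diagonal D^(l) only on the slice S.
G = [rng.normal(size=(N, N)) / np.sqrt(N) for _ in range(L)]
c = rng.normal(size=L)
x = rng.normal(size=N)                   # fixed input, independent of sigma here

def gate(l, sigma):
    """Diagonal of the local nonlinearity D^(l) for the given spins."""
    return 1.5 + np.tanh(c[l] * sigma)

def forward(sigma):
    """Reference O(L N^2) pass: h <- D(sigma) * (G h), then mean pooling."""
    h = x
    for l in range(L):
        h = gate(l, sigma) * (G[l] @ h)
    return h.mean()

def precompute(sigma0):
    """Slice-restricted quantities for the reference configuration sigma0."""
    h, u0 = x, []
    for l in range(L):
        u = G[l] @ h
        u0.append(u[S])                  # pre-gate activation on the slice
        h = gate(l, sigma0) * u
    link = [[None] * L for _ in range(L)]
    readout = []
    for m in range(L):
        P = np.eye(N)[:, S]              # unit source on the slice after layer m
        for l in range(m + 1, L):
            GP = G[l] @ P
            link[l][m] = GP[S]           # slice-to-slice propagator, layer m to l
            P = gate(l, sigma0)[:, None] * GP
        readout.append(P.mean(axis=0))   # slice-to-output propagator
    return h.mean(), u0, link, readout

def local_update(sigma0, sigma_new, cache):
    """Output for sigma_new using only slice-sized objects."""
    omega0, u0, link, readout = cache
    sources, omega = [], omega0
    for l in range(L):
        u = u0[l] + sum(link[l][m] @ sources[m] for m in range(l))
        dD = gate(l, sigma_new[S]) - gate(l, sigma0[S])
        sources.append(dD * u)           # extra "scattered" amplitude at layer l
        omega = omega + readout[l] @ sources[l]
    return omega

sigma0 = rng.choice([-1.0, 1.0], size=N)
sigma_new = sigma0.copy()
sigma_new[j] *= -1                       # single spin flip
cache = precompute(sigma0)
omega_fast = local_update(sigma0, sigma_new, cache)
omega_full = forward(sigma_new)
print(abs(omega_fast - omega_full))
```

In this toy, `local_update` touches only objects of size $w$ or $w\times w$ per pair of layers, so its cost does not depend on $N$. The dense `precompute` step shown here is deliberately naive and is tied to one slice location; how the required quantities are obtained efficiently is exactly what the theorem's precomputation clause addresses.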

Figures (8)

  • Figure 1: (a) DysonNet neural network architecture. Layers alternate global Green’s-function convolutions $G^{(l)}$ with local nonlinearity $D^{(l)}(\boldsymbol \sigma)$; the readout is the mean-pooling projector $P_M$. A single spin flip $\sigma_j\!\to\!-\sigma_j$ (red) modifies $D^{(l)}$ only within a slice of width $w$ around site $j$. Subsequent convolutions $G^{(l)}$ propagate this local change globally, building long-range correlations. (b) ABACUS local update (Algorithm \ref{alg:local-update-short}). Slice-restricted activations are built recursively using the link tensors $L^{(l,m)}$ and $T^{(l)}$; see Eqs. \ref{eq:recur_step}–\ref{eq:recur_out}. (c) Practical realization of a DysonNet block. An embedded spin sequence splits into two streams: $\boldsymbol\phi$ (CNN path; local, nonlinear) and $\boldsymbol h$ (SSM path; nonlocal, kept linear in $\boldsymbol h$). Each of the $L$ stacked blocks updates $\boldsymbol \phi$ with a dense layer, a small-kernel CNN layer, and a SiLU nonlinearity. In the $\boldsymbol h$ stream we add $\boldsymbol{\phi}$, apply the SSM kernel and apply multiplicative gating via $D(\boldsymbol \phi)=\mathrm{SiLU}(\boldsymbol \phi)$. After $L$ blocks, mean pooling over positions yields the readout. Color/shape key: purple -- dense layers; green -- token mixers (CNN/SSM); orange -- normalization; blue -- projector (mean pooling); circle -- nonlinearity.
  • Figure 2: DysonNet outperforms ViT in ordered phases and on V-score. (a–d) Performance of DysonNet versus RBM and ViT neural-quantum states (NQS) on the long-range TFIM. (a,b) Energy difference $E_{\mathrm{DysonNet}}-E_{\mathrm{ViT}}$ (squares) and $E_{\mathrm{DysonNet}}-E_{\mathrm{RBM}}$ (triangles); (c,d) V-score. Left: sweep in coupling $J$ at fixed $\alpha=4$; right: sweep in $\alpha$ at fixed $J=4.75$. DysonNet attains substantially lower energies in the ordered (FM/AFM) regimes by $10^{-2}\!-\!10^{-3}$, while in the paramagnet ViT is slightly lower ($<10^{-4}$, effectively equal). DysonNet consistently attains lower energies than RBM by at least two orders of magnitude. DysonNet yields markedly smaller V-scores, especially in the short-range AFM regime (by up to two orders of magnitude vs ViT and four orders of magnitude vs RBM). Points are means over three training runs with identical sampler settings; error bars denote one s.d.; each model is trained for 400 iterations. System size $N=150$.
  • Figure 3: ABACUS enables constant-time local updates and improves asymptotic scaling. (a) Total training time for 400 iterations with Monte-Carlo sweep length $N/50$, using for each method the hyperparameters that give best overall performance (same choices as in Fig. \ref{fig:ViTvCOBALT}); at $N=500$, ViT requires $\sim60$ h while DysonNet+ABACUS finishes in $\sim2.5$ h. (b) Local-estimator runtime per connected matrix element with matched hyperparameters across models (Table \ref{tab:runtime_matched_hparams}); DysonNet+ABACUS is approximately flat in $N$ and at $N=1000$ is $230\times$ faster than ViT and $16\times$ faster than DysonNet. (c) Single-spin-flip proposal time under the same matched setting; only ABACUS remains approximately constant in $N$, giving a $99\times$ speedup vs ViT and $8.3\times$ vs DysonNet at $N=1000$. RBM baselines have smaller constants than ViT at moderate $N$ but scale worse than ABACUS-accelerated updates. All results use $J=-5$, $\alpha=6$, 1024 samples, TF32 precision, and an NVIDIA Tesla P100.
  • Figure 4: Optimal window exponent for the screened typewriter sampler. We plot $1-\beta_c$ versus the interaction-range exponent $\alpha$, where the optimal spacing scales as $w(N)\sim N^{\beta_c}$ and the expected number of accepted flips per sweep scales as $\mathbb{E}[M]\sim N^{1-\beta_c}$. The dashed line marks the crossover at $\alpha=1$. For non-resummable power-law tails $0<\alpha<1$ one finds $\beta_c=(3-\alpha)/3$ (hence $1-\beta_c=\alpha/3$, left annotation). For resummable tails $\alpha>1$ one obtains $\beta_c=2/(2+\alpha)$ (so $1-\beta_c=\alpha/(2+\alpha)$, right annotation). These exponents imply the per-flip runtime scalings quoted in the text, $T_{\text{per flip}}=\mathcal{O}(N^{\beta_c}\log N)$.
  • Figure 5: Scaling of average sampler throughput with system size $N$ for interaction range $\alpha=1.5$. The dashed black line indicates the theoretical maximum throughput $N/2W$ in the ideal limit of zero screening. Colored symbols represent different coupling strengths $J$. The throughput scales approximately linearly, demonstrating efficient amortization of the $\mathcal{O}(N \log N)$ update cost. Deviations from the ideal line at large $N$ signal the onset of premature freezing, an effect that is most pronounced near the critical point ($J_c \approx -1.4$) due to enhanced long-range correlations.
  • ...and 3 more figures
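
The window exponents quoted in the Figure 4 caption can be tabulated directly. The helper below is a small illustrative sketch: the function name `beta_c` and the sample values of $\alpha$ are choices made here, and the two branches are simply the caption's formulas, which agree at $\alpha=1$ (both give $2/3$).

```python
def beta_c(alpha):
    """Window exponent from the Figure 4 caption: w(N) ~ N**beta_c."""
    if alpha <= 0:
        raise ValueError("the caption only covers alpha > 0")
    if alpha < 1:
        return (3.0 - alpha) / 3.0       # non-resummable tails, 0 < alpha < 1
    return 2.0 / (2.0 + alpha)           # resummable tails; equals 2/3 at alpha = 1

# Sample interaction-range exponents (illustrative choices).
for alpha in (0.5, 1.0, 1.5, 4.0, 6.0):
    b = beta_c(alpha)
    print(f"alpha={alpha}: beta_c={b:.4f}, accepted-flip exponent 1-beta_c={1 - b:.4f}")
```

Larger $\alpha$ gives a smaller $\beta_c$, so by the caption's $T_{\text{per flip}}=\mathcal{O}(N^{\beta_c}\log N)$ the per-flip cost grows more slowly for shorter-range interactions.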

Theorems & Definitions (5)

  • Theorem 1: ABACUS exactness and complexity
  • Definition 2: HODLR matrix
  • Proposition 3: Sum of exponentials $\Rightarrow$ exact low-rank off-diagonal blocks
  • Proof
  • Remark 4: Damped oscillations / complex poles
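
The statement named in Proposition 3 can be checked numerically on a small example. This is a sketch under assumptions, not the paper's construction: it uses a hypothetical kernel $K_{ij}=\sum_k c_k \lambda_k^{|i-j|}$ with three real decay rates and verifies that an off-diagonal block has rank at most three by exhibiting an explicit factorization.

```python
import numpy as np

N = 64
lam = np.array([0.5, 0.7, 0.9])          # hypothetical decay rates
coef = np.array([1.0, -0.4, 0.25])       # hypothetical weights

idx = np.arange(N)
dist = np.abs(idx[:, None] - idx[None, :])
K = (coef * lam ** dist[..., None]).sum(axis=-1)   # sum-of-exponentials kernel

# Lower-left off-diagonal block: rows i >= N/2, columns j < N/2, so i - j > 0.
half = N // 2
block = K[half:, :half]
s = np.linalg.svd(block, compute_uv=False)
numerical_rank = int((s > 1e-10 * s[0]).sum())

# Explicit factorization: lam**(i - j) = lam**(i - half) * lam**(half - j).
rows = idx[half:] - half                 # exponents 0 .. half-1
cols = half - idx[:half]                 # exponents half .. 1
U = lam[None, :] ** rows[:, None]
W = coef[None, :] * lam[None, :] ** cols[:, None]

print("numerical rank of off-diagonal block:", numerical_rank)
print("max factorization error:", np.abs(block - U @ W.T).max())
```

Because the exponential of a difference factorizes into a product, each term contributes a rank-one piece to any block that lies entirely on one side of the diagonal. The same argument applies recursively to the diagonal sub-blocks, which is the structure a HODLR representation exploits.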