Connectivity determines the capability of sparse neural network quantum states

Brandon Barton, Juan Carrasquilla, Christopher Roth, Agnes Valenti

TL;DR

This work extends the Lottery Ticket Hypothesis to neural-network quantum states for unsupervised ground-state discovery in quantum many-body systems. It demonstrates that sparse subnetworks with as little as 5–20% of the dense parameters can match dense-model performance across the transverse-field Ising model and the toric code, with performance governed by connectivity rather than initialization. The study uncovers universal scaling laws for sparse networks, reveals a sparsity-induced first-order phase transition, and provides an interpretable, asymptotically exact sparse solution to the toric code via specific odd-parity filters. These findings offer a principled route to compact, physically meaningful quantum-state representations and open avenues for efficient simulation and analysis of complex quantum systems.

Abstract

The Lottery Ticket Hypothesis (LTH) posits that within overparametrized neural networks, there exist sparse subnetworks that are capable of matching the performance of the original model when trained in isolation from the original initialization. We extend this hypothesis to the unsupervised task of approximating the ground state of quantum many-body Hamiltonians, a problem equivalent to finding a neural-network compression of the lowest-lying eigenvector of an exponentially large matrix. Focusing on two representative quantum Hamiltonians, the transverse field Ising model (TFIM) and the toric code (TC), we demonstrate that sparse neural networks can reach accuracies comparable to their dense counterparts, even when pruned by more than an order of magnitude in parameter count. Crucially, and unlike the original LTH, we find that performance depends only on the structure of the sparse subnetwork, not on the specific initialization, when trained in isolation. Moreover, we identify universal scaling behavior that persists across network sizes and physical models, where the boundaries of scaling regions are determined by the underlying Hamiltonian. At the onset of high-error scaling, we observe signatures of a sparsity-induced quantum phase transition that is first-order in shallow networks. Finally, we demonstrate that pruning enhances interpretability by linking the structure of sparse subnetworks to the underlying physics of the Hamiltonian.
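To make the ground-state task concrete, the following minimal sketch evaluates the variational energy of a trial state and its relative error against an exact reference. It is an illustration under stated assumptions, not the authors' setup: the Hamiltonian convention $H = -\sum_i Z_i Z_{i+1} - \kappa \sum_i X_i$, the open 1D chain, the two-spin system size, and the helper names `tfim_energy` and `relative_error` are all chosen here for illustration. The paper itself works with 2D lattices and neural-network wave functions.

```python
import math


def tfim_energy(psi, n_spins, kappa):
    """Rayleigh quotient <psi|H|psi> / <psi|psi> for a real state vector psi.

    Assumed convention (not taken from the paper):
    H = -sum_i Z_i Z_{i+1} - kappa * sum_i X_i on an open 1D chain.
    Basis state s encodes spin i in bit i of the integer s.
    """
    num = 0.0
    for s, amp in enumerate(psi):
        # Diagonal part: each aligned neighbour pair contributes -1, else +1.
        diag = 0.0
        for i in range(n_spins - 1):
            aligned = ((s >> i) & 1) == ((s >> (i + 1)) & 1)
            diag += -1.0 if aligned else 1.0
        num += amp * diag * amp
        # Off-diagonal part: X_i flips spin i with matrix element -kappa.
        for i in range(n_spins):
            num += -kappa * amp * psi[s ^ (1 << i)]
    norm = sum(a * a for a in psi)
    return num / norm


def relative_error(energy, reference):
    """Relative deviation of a variational energy from a reference energy."""
    return abs((energy - reference) / reference)


# Two spins at kappa = 1: the exact ground energy is -sqrt(1 + 4 kappa^2).
kappa = 1.0
exact = -math.sqrt(1.0 + 4.0 * kappa**2)
trial = [1.0, 1.0, 1.0, 1.0]  # product state with every spin along +x
e_trial = tfim_energy(trial, 2, kappa)
print(e_trial, exact, relative_error(e_trial, exact))
```

For this toy case the product state gives an energy of $-2$, above the exact value $-\sqrt{5} \approx -2.236$, as the variational principle requires. A neural-network quantum state plays the role of `trial`, with its parameters adjusted to lower this energy.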

Paper Structure

This paper contains 40 sections, 21 equations, 8 figures, 9 tables, 1 algorithm.

Figures (8)

  • Figure 1: Test of the LTH at the critical point $\kappa = \kappa_c$ of the TFIM on the $N=10\times10$ lattice, for various neural network architectures: (a) shallow CNN, (b) shallow FFNN, and (c) Res-CNN (shaded regions indicate statistical sampling errors). The relative error is computed with respect to the average energy of the three best-performing dense Res-CNNs, selected by smallest variance (see Appendix).
  • Figure 2: Regions of error scaling within IMP-WR for the TFIM on the $N=10 \times 10$ lattice using FFNNs with depth $d=1$. The subplots correspond to simulations at (a) fixed $\kappa$ (at the critical point $\kappa=\kappa_c$) with varying network widths $w=\alpha N$, (b) varying $\kappa<\kappa_c$ within the ferromagnetic phase and fixed width $w = 5N$, and (c) varying $\kappa>\kappa_c$ within the paramagnetic phase and fixed width $w = 5N$. We qualitatively highlight the scaling regions via colors in (a) and dashed lines in (b) and (c).
  • Figure 3: Evidence of a sparsity-induced quantum phase transition in shallow (depth $d=1$) FFNNs. The data correspond to finite-size scaling at the quantum critical point $\kappa = \kappa_c$ of the TFIM for $N=4\times 4$ through $N=10 \times 10$ spins. We plot (a) the relative error per spin versus the number of parameters per spin, (b) the fidelity between neighboring pruned wave functions, and (c,d) the total magnetization in the $x$ and $z$ directions. The phase transition is marked by a dashed line in (a,c,d).
  • Figure 4: (a) LTH test for the toric code with $N=18$ spins using a depth $d=1$ FFNN of width $w=8 N$. (b) Fidelity between neighboring pruned NQS, with a dashed line at the critical number of parameters $\rho_c = 16$. (c) Schematic of a single hidden neuron connected to a single plaquette. (d) Normalized weights and their sign configuration from eight single hidden neurons in a pruned FFNN.
  • Figure 5: Comparison of iterative pruning variants: weight rewinding (IMP-WR), continued training (IMP-CT), and random pruning with weight rewinding (IRP-WR). The data correspond to the TFIM on an $N=10\times10$ lattice at the critical point ($\kappa = \kappa_c$). We plot the relative error of these three variants starting from the same dense (a) FFNN $(d=1, w=5N)$ and (c) CNN $(d=1, n_f=4)$. In (b), we show the energy per spin over the total training steps in 65 pruning iterations from the shallow FFNN in (a).
  • ...and 3 more figures
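
The three pruning variants named in the Figure 5 caption (IMP-WR, IMP-CT, IRP-WR) can be sketched as one generic loop. This is a hedged illustration on a flat list of weights, not the authors' implementation: the function names, the `train` callback, the per-iteration pruning fraction `frac`, and the choice to rewind survivors to their initial values (rather than to an early training checkpoint) are assumptions made here.

```python
import random


def prune_step(weights, mask, frac, rng=None):
    """Return a new mask with a fraction `frac` of the surviving weights removed.

    rng=None  -> magnitude pruning (drop the smallest |w| among survivors).
    rng given -> random pruning (drop survivors chosen uniformly at random).
    """
    alive = [i for i, keep in enumerate(mask) if keep]
    n_drop = int(frac * len(alive))
    if rng is None:
        drop = sorted(alive, key=lambda i: abs(weights[i]))[:n_drop]
    else:
        drop = rng.sample(alive, n_drop)
    new_mask = list(mask)
    for i in drop:
        new_mask[i] = False
    return new_mask


def iterative_pruning(rewind_to, train, n_iters, frac, rewind=True, rng=None):
    """Train, prune, then rewind or continue, repeated n_iters times.

    rewind=True,  rng=None -> magnitude pruning with weight rewinding (IMP-WR)
    rewind=False, rng=None -> magnitude pruning, continued training (IMP-CT)
    rewind=True,  rng set  -> random pruning with weight rewinding (IRP-WR)
    `train(weights, mask)` is a hypothetical callback returning trained weights.
    """
    mask = [True] * len(rewind_to)
    weights = list(rewind_to)
    history = []  # (number of surviving weights, trained weights) per iteration
    for _ in range(n_iters):
        weights = train(weights, mask)
        history.append((sum(mask), list(weights)))
        mask = prune_step(weights, mask, frac, rng)
        if rewind:
            # Reset survivors to the stored values; pruned weights stay at zero.
            weights = [w0 if keep else 0.0 for w0, keep in zip(rewind_to, mask)]
        else:
            # Keep the trained values of the survivors and continue from there.
            weights = [w if keep else 0.0 for w, keep in zip(weights, mask)]
    return mask, weights, history


# Toy usage: "training" simply moves each surviving weight to a fixed target.
targets = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
toy_train = lambda w, m: [t if keep else 0.0 for t, keep in zip(targets, m)]
init = [0.1 * (i + 1) for i in range(8)]
mask, weights, history = iterative_pruning(init, toy_train, n_iters=2, frac=0.5)
print(mask, weights)
irp_mask, _, _ = iterative_pruning(init, toy_train, 2, 0.5, rng=random.Random(0))
print(irp_mask)
```

Passing the same `rewind_to` values to a retraining run with a fixed mask, or swapping them for a fresh random draw, is one way to probe whether performance depends on the initialization or only on the connectivity encoded in `mask`.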