When can classical neural networks represent quantum states?

Tai-Hsuan Yang, Mehdi Soleimanifar, Thiago Bergamaschi, John Preskill

TL;DR

This work provides an information-theoretic lens for understanding when classical neural networks can efficiently represent quantum states. By analyzing measurement-induced conditional correlations via conditional mutual information, the authors connect state entanglement, sign structure, and measurement basis to the tractability of neural quantum states, proving that short-range correlations enable shallow representations while long-range correlations pose challenges. They prove formal results for states with approximate conditional independence, and they illustrate these ideas with numerical studies of prototypical spin systems and rotated cluster states, linking correlation length to variational Monte Carlo performance. The findings offer a principled framework for predicting the success or failure of neural approaches to simulate quantum systems and guide the design of architectures for regimes with complex correlation patterns.

Abstract

A naive classical representation of an n-qubit state requires specifying exponentially many amplitudes in the computational basis. Past works have demonstrated that classical neural networks can succinctly express these amplitudes for many physically relevant states, leading to computationally powerful representations known as neural quantum states. What underpins the efficacy of such representations? We show that conditional correlations present in the measurement distribution of quantum states control the performance of their neural representations. Such conditional correlations are basis dependent, arise due to measurement-induced entanglement, and reveal features not accessible through conventional few-body correlations often examined in studies of phases of matter. By combining theoretical and numerical analysis, we demonstrate how the state's entanglement and sign structure, along with the choice of measurement basis, give rise to distinct patterns of short- or long-range conditional correlations. Our findings provide a rigorous framework for exploring the expressive power of neural quantum states.
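
The central quantity in the abstract, the conditional correlation in the measurement distribution, can be made concrete on a toy example. The sketch below computes the standard Shannon conditional mutual information $I(\mathsf{A}:\mathsf{C}|\mathsf{B}) = H(\mathsf{AB}) + H(\mathsf{BC}) - H(\mathsf{B}) - H(\mathsf{ABC})$ of $p(x) = |\psi(x)|^2$ by brute force. The function names (`entropy`, `marginal`, `measurement_cmi`) and the two 3-qubit test states are illustrative assumptions, not taken from the paper, and the paper's own definition may differ in normalization or in how the regions are chosen.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log2(p)))

def marginal(p, keep, n):
    """Marginal of a distribution over n bits onto the bit positions in `keep`."""
    tensor = p.reshape((2,) * n)
    drop = tuple(i for i in range(n) if i not in keep)
    return np.asarray(tensor.sum(axis=drop)).ravel()

def measurement_cmi(psi, A, B, C):
    """I(A:C|B) of p(x) = |psi(x)|^2 for disjoint lists of qubit indices."""
    n = len(psi).bit_length() - 1          # len(psi) == 2**n
    p = np.abs(psi) ** 2
    p = p / p.sum()
    H = lambda S: entropy(marginal(p, set(S), n))
    return H(A + B) + H(B + C) - H(B) - H(A + B + C)

# GHZ state (|000> + |111>)/sqrt(2), measured in the computational basis.
ghz = np.zeros(8)
ghz[[0, 7]] = 1 / np.sqrt(2)

# The same GHZ state after a Hadamard on every qubit (i.e. measured in the
# X basis): an equal superposition of all even-parity bit strings.
parity = np.zeros(8)
parity[[0, 3, 5, 6]] = 0.5

cmi_ghz = measurement_cmi(ghz, [0], [1], [2])
cmi_parity = measurement_cmi(parity, [0], [1], [2])
print(cmi_ghz, cmi_parity)
```

For this toy state the computational-basis distribution has zero conditional mutual information between the first and last qubit, while the X-basis distribution carries one full bit, which illustrates the basis dependence described above.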

Paper Structure

This paper contains 38 sections, 23 theorems, 129 equations, 7 figures.

Key Result

Theorem 2.2

Consider $n$ qubits arranged on a $D_{\Lambda}$-dimensional lattice $\Lambda$ with a quantum state $\ket{\psi}=\sum_{x \in \{0,1\}^n} \psi(x) \ket{x}$. Suppose the measurement distribution $p(x) = |\psi(x)|^2$ in this basis satisfies the approximate conditional independence property given in Definition 2.1. […] that computes a function $q(x)$ such that $\sum_{x}|q(x)-p(x)|\leq 1/\operatorname{poly}(n)$. This […]
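
To see why conditional independence helps with an $\ell_1$ guarantee of the form $\sum_x |q(x)-p(x)|$, the following sketch builds an approximation $q$ in which each bit is conditioned only on the previous $k$ bits, then measures its $\ell_1$ error against $p$. This is a generic truncated chain-rule construction written for illustration. It is an assumption of this example, not the neural network construction of the theorem, and the helper names (`windowed_approximation`, `l1_error`) are hypothetical.

```python
import numpy as np
from itertools import product

def marginal(p, keep, n):
    """Marginal of p over n bits onto positions `keep` (axes stay in increasing order)."""
    tensor = p.reshape((2,) * n)
    drop = tuple(i for i in range(n) if i not in keep)
    return np.asarray(tensor.sum(axis=drop))

def windowed_approximation(p, n, k):
    """q(x) = prod_i p(x_i | x_{i-k}, ..., x_{i-1}), built from exact marginals of p."""
    q = np.zeros(2 ** n)
    # product() enumerates bit strings in the same order as the flat index of p.
    for idx, x in enumerate(product((0, 1), repeat=n)):
        prob = 1.0
        for i in range(n):
            ctx = list(range(max(0, i - k), i))       # the k preceding bits
            joint = marginal(p, ctx + [i], n)
            cond = marginal(p, ctx, n)
            num = joint[tuple(x[j] for j in ctx + [i])]
            den = cond[tuple(x[j] for j in ctx)]
            prob *= num / den if den > 0 else 0.0
        q[idx] = prob
    return q

def l1_error(p, n, k):
    return float(np.abs(windowed_approximation(p, n, k) - p).sum())

# Measurement distributions of two 3-qubit toy states (illustrative only).
ghz_p = np.zeros(8)
ghz_p[[0, 7]] = 0.5                 # GHZ in the computational basis
parity_p = np.zeros(8)
parity_p[[0, 3, 5, 6]] = 0.25       # uniform over even-parity strings

err_ghz_k1 = l1_error(ghz_p, 3, 1)
err_parity_k1 = l1_error(parity_p, 3, 1)
err_parity_k2 = l1_error(parity_p, 3, 2)
print(err_ghz_k1, err_parity_k1, err_parity_k2)
```

In this toy setting a window of one bit reproduces the GHZ distribution exactly, whereas the even-parity distribution needs the full two-bit history, a small-scale picture of short-range versus long-range conditional correlations.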

Figures (7)

  • Figure 1: Neural representation of $\bm{p(x) = |\psi(x)|^2}$. (a) A feedforward neural network with $n$ input nodes and specified depth and width. (b) A recurrent neural network consists of multiple recurrent cells, each efficiently computing a conditional distribution using a feedforward neural network and a hidden state, which is passed to the next cell. This allows autoregressive sampling from the joint distribution $p(x_1,\dots, x_n)$ using the chain rule $p(x_1,\dots,x_n) = \prod_{i=1}^{n} p(x_i \mid x_1,\dots,x_{i-1})$. Here $h_1,\dots, h_{n-1}$ are the hidden states, $x_0$ is the start token initialized as zero, and $x_1,\dots, x_n$ are the input bits.
  • Figure 2: Partitioning the lattice. Geometrically contiguous regions $(\mathsf{A},\mathsf{B},\mathsf{C})$ needed for the neural network construction in Theorem 2.2 in (a) one-dimensional, and (b) two-dimensional lattices. In each example, a contiguous part of the lattice has been traced out and a tripartition of the remainder is considered such that $\mathsf{A} \cap \mathsf{C} = \emptyset$. The approximate conditional independence in Definition 2.1 is assumed to hold for such tripartitions.
  • Figure 3: Conditional correlations in shallow quantum circuits via entanglement swapping. (a) The decomposition of a one-dimensional shallow circuit with the brickwork architecture into backward lightcones in green and forward lightcones in orange. (b) The conditional correlations generated between subsystems $\mathsf{L_1}$ and $\mathsf{R_3}$ due to computational basis measurements on subsystems $\mathsf{R_1}$, $\mathsf{L_2}$, $\mathsf{R_2}$, and $\mathsf{L_3}$ can be understood as two rounds of entanglement swapping measurements.
  • Figure 4: Conditional independence vs. sign structure in tensor networks: (a) The decay of conditional mutual information (CMI) in random MPS over $n=15$ qubits is demonstrated. The MPS tensor entries are real numbers drawn independently from the normal distribution $\mathcal{N}(\mu, 1)$ with mean $\mu$. For $\mu \simeq 0$, where the resulting amplitudes likely have varying signs $\psi(x)/|\psi(x)|$, increasing the bond dimension leads to longer-range conditional correlations. For larger $\mu\gg 0$, where more amplitudes are positive, correlations remain short-range, eventually with shorter correlation lengths at larger bond dimensions due to concentration effects. (b) A similar behavior is observed in random PEPS on a $5 \times 5$ qubit grid. For a fixed bond dimension of $5$, the CMI decays exponentially when $\mu \gg 0$. However, for small $\mu \simeq 0$, the CMI does not decay, resulting in long-range conditional correlations. The MPS and PEPS plots show the average results over $200$ and $10$ samples per bond dimension, respectively.
  • Figure 5: Conditional independence in prototypical spin systems: We observe the decay of conditional mutual information as a function of distance $\mathrm{dist}(\mathsf{A}, \mathsf{C})$ in the disordered phase of three gapped Hamiltonians with a unique ground state. (a) 1D transverse-field Ising model $H_{\text{Ising}} = J \sum_{i\in[n]} X_i X_{i+1} + h \sum_i Z_i$ with $J=1$ and $\mathsf{A}$ a single qubit. (b) Heisenberg ladder with cross interactions: $H_{\text{ladder}}= J_{\parallel} \sum_{i\in[n-1],j\in[2]} S_{i,j}\cdot S_{i+1,j} + J_{\perp}\sum_{i\in[n]} S_{i,1}\cdot S_{i,2} + J_{\times} \sum_{i\in[n-1]} (S_{i,1}\cdot S_{i+1,2} + S_{i,2} \cdot S_{i+1,1})$ with $J_{\parallel} = 1$, $J_{\times} = 0.1$, and $\mathsf{A}$ consisting of two qubits. (c) Rydberg atoms on a 2D lattice: $H_{\text{Rydberg}} = \sum_{i<j} \frac{C}{4 \lVert r_i-r_j \rVert^6}(\mathds{1}+Z_i)(\mathds{1}+Z_j)-\frac{\delta}{2} \sum_{i=1}^n (\mathds{1}+Z_i)-\frac{\Omega}{2}\sum_{i=1}^n X_i$ with $\Omega = 1.0$, $R_b = 1.2$, $C = \Omega \cdot R_b^6$, detuning parameter $\delta$, and $\mathsf{A}$ being a single qubit.
  • ...and 2 more figures
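
The autoregressive sampling described in the Figure 1 caption can be sketched without any neural network by reading the conditionals off an explicit probability table. In the snippet below the exact conditional `prob_next_is_one` stands in for what a recurrent cell would output. This substitution, the function names, and the 3-qubit GHZ test distribution are all assumptions made for illustration, and the approach only scales to a handful of qubits.

```python
import numpy as np
from itertools import product

def prob_next_is_one(p, prefix, n):
    """p(x_i = 1 | x_1..x_{i-1} = prefix), read off an explicit joint table."""
    sub = p.reshape((2,) * n)[tuple(prefix)]     # fix the bits chosen so far
    weights = sub.reshape(2, -1).sum(axis=1)     # sum out all later bits
    return weights[1] / weights.sum()

def sample_autoregressive(p, n, rng):
    """Draw one bit string by sampling x_1, then x_2 given x_1, and so on."""
    x = []
    for _ in range(n):
        x.append(int(rng.random() < prob_next_is_one(p, x, n)))
    return tuple(x)

def chain_rule_probability(p, x, n):
    """Rebuild p(x) as the product of the one-bit conditionals."""
    prob = 1.0
    for i in range(n):
        if prob == 0.0:                          # prefix already impossible
            return 0.0
        p1 = prob_next_is_one(p, x[:i], n)
        prob *= p1 if x[i] == 1 else 1.0 - p1
    return prob

rng = np.random.default_rng(0)
ghz_p = np.zeros(8)
ghz_p[[0, 7]] = 0.5                              # GHZ measurement distribution

samples = {sample_autoregressive(ghz_p, 3, rng) for _ in range(200)}
rebuilt = np.array([chain_rule_probability(ghz_p, x, 3)
                    for x in product((0, 1), repeat=3)])
print(samples, rebuilt)
```

Every sample is drawn bit by bit, so no normalization over all $2^n$ strings is needed at sampling time. In a recurrent network the table lookup would be replaced by a learned conditional that depends on the hidden state.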

Theorems & Definitions (48)

  • Definition 2.1: Approximate conditional independence in measurement distribution
  • Theorem 2.2: Neural quantum state from conditional independence
  • Theorem 2.3: Conditional independence in random shallow 1D circuits
  • Theorem 2.4: Approximate conditional independence implies area law
  • Theorem 3.1: Decay of CMI in rotated cluster state
  • Theorem B.1
  • proof
  • Remark B.2
  • Theorem B.3: Restatement of Theorem 2.2
  • proof
  • ...and 38 more