Fragmentation is Efficiently Learnable by Quantum Neural Networks

Mikhail Mints, Eric R. Anschuetz

TL;DR

The paper studies learning the Schur transform in Hilbert-space-fragmented quantum systems using quantum neural networks (QNNs). Restricting to fragmentation strong enough that the summed dimension of the unique Krylov subspaces is polynomial in the system size, and leveraging a sectorized ETH framework, it shows that gradient-based training can avoid barren plateaus and enter an overparameterized regime, so that the Schur transform can be learned efficiently from Schur-basis training states. It argues that no efficient classical dequantization is known for this task, because the algebra defining the fragmentation is not known a priori and is not guaranteed to have sparse elements, and it supports these claims with formal proofs and numerical demonstrations on Temperley-Lieb fragmentation. The work highlights a physically motivated quantum learning scenario with potential experimental relevance and invites further exploration of the limits of classical simulation in symmetric quantum systems.

Abstract

Hilbert space fragmentation is a phenomenon in which the Hilbert space of a quantum system is dynamically decoupled into exponentially many Krylov subspaces. We can define the Schur transform as a unitary operation mapping some set of preferred bases of these Krylov subspaces to computational basis states labeling them. We prove that this transformation can be efficiently learned via gradient descent from a set of training data using quantum neural networks, provided that the fragmentation is sufficiently strong such that the summed dimension of the unique Krylov subspaces is polynomial in the system size. To demonstrate this, we analyze the loss landscapes of random quantum neural networks constructed out of Hilbert space fragmented systems. We prove that in this setting, it is possible to eliminate barren plateaus and poor local minima, suggesting efficient trainability when using gradient descent. Furthermore, as the algebra defining the fragmentation is not known a priori and not guaranteed to have sparse algebra elements, to the best of our knowledge there are no existing efficient classical algorithms generally capable of simulating expectation values in these networks. Our setting thus provides a rare example of a physically motivated quantum learning task with no known dequantization.
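As a minimal illustration of the definition above (a unitary that maps a preferred basis to the computational basis states labeling it), the following NumPy sketch builds such a unitary for a toy 4-dimensional space. The random orthonormal basis is only a hypothetical stand-in for a preferred Krylov-subspace basis, and the helper names `random_orthonormal_basis` and `basis_to_computational_unitary` are illustrative; none of this is taken from the paper.

```python
import numpy as np


def random_orthonormal_basis(dim, rng):
    """Columns form a random orthonormal basis (toy stand-in for a preferred basis)."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)  # q is unitary, so its columns are orthonormal
    return q


def basis_to_computational_unitary(basis):
    """Unitary sending the k-th column of `basis` to the k-th computational basis state."""
    return basis.conj().T


rng = np.random.default_rng(0)
dim = 4
basis = random_orthonormal_basis(dim, rng)
U = basis_to_computational_unitary(basis)

# Check that U is unitary and that U maps basis vector k to the label state e_k.
is_unitary = np.allclose(U @ U.conj().T, np.eye(dim))
maps_to_labels = np.allclose(U @ basis, np.eye(dim))
print(is_unitary, maps_to_labels)
```

Here the basis is known explicitly, so the unitary can simply be written down. In the learning task described in the abstract, the transformation is instead learned from training states.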

Paper Structure

This paper contains 10 sections, 13 theorems, 128 equations, 3 figures.

Key Result

Theorem 1

If our dataset $\mathcal{D}$ contains an instance $\ket{\lambda, q_\lambda, p_\lambda}$ for each choice of $\lambda$ and $q_\lambda$, then if we take the (exponential-size) dataset $\mathcal{D}'=\left\{\ket{\lambda, q_\lambda, p_\lambda}\right\}_{\lambda,q_\lambda,p_\lambda}$ consisting of every Sch

Figures (3)

  • Figure 1: Diagram of the QNN training process. We are given a dataset of Schur basis states and randomly sample a QNN ansatz architecture and an initial vector of parameters. We then compile the QNN into a quantum circuit and repeatedly run on the input states to estimate the gradient of the loss function, which is then used to adjust the parameters.
  • Figure 2: Training curves for QNN models with $1$, $5$, $10$, $20$, and $40$ parameters on the $4$-qubit Temperley-Lieb dataset (a) and the $8$-qubit Temperley-Lieb dataset (b). For each number of parameters, $10$ QNNs were randomly initialized and trained for $200$ epochs with a learning rate of $0.1$. Training was stopped if the loss was decaying slower than $5\%$ every $5$ epochs or if it reached less than $0.01$. The faded lines show the individual training curves, and the solid lines show their averages. For these diagrams, the loss value is adjusted to scale from $0$ to $1$ instead of from $-1$ to $0$.
  • Figure 3: Distribution of local minima reached during training for the $4$-qubit Temperley-Lieb dataset. For this plot, $1000$ QNNs were randomly initialized for each number of parameters and trained for $200$ epochs with a learning rate of $0.1$. The data plotted on the $x$-axis is the final loss achieved during training. Since we can assume that during gradient descent the parameters converge to the "closest" local minimum, this is a proxy for the distribution of the local minima of the loss landscape. We can see that with $15$ parameters, the QNN enters the overparameterized regime since the distribution of minima becomes peaked at zero.
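The training procedure described in the captions above can be sketched on a toy problem. The sketch below is not the authors' implementation: the random Hermitian generators, the negative-mean-fidelity loss, the finite-difference gradient (a classical stand-in for gradients estimated from circuit runs), and all function names are assumptions made for illustration. Only the learning rate of $0.1$, the $200$-epoch budget, the two stopping rules, and the rescaling of the loss from $[-1, 0]$ to $[0, 1]$ come from the captions, and the reading of "decaying slower than $5\%$ every $5$ epochs" used here is one possible interpretation.

```python
import numpy as np


def random_generator(dim, rng):
    """Random Hermitian generator with unit spectral norm (hypothetical ansatz layer)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    eigvals, eigvecs = np.linalg.eigh((a + a.conj().T) / 2)
    return eigvals / np.max(np.abs(eigvals)), eigvecs


def circuit_unitary(thetas, generators, dim):
    """Product of layers exp(-i * theta_k * H_k), applied in order."""
    u = np.eye(dim, dtype=complex)
    for theta, (eigvals, eigvecs) in zip(thetas, generators):
        layer = (eigvecs * np.exp(-1j * theta * eigvals)) @ eigvecs.conj().T
        u = layer @ u
    return u


def loss(thetas, generators, inputs, targets):
    """Negative mean fidelity between U(theta)|input> and |target>, in [-1, 0]."""
    u = circuit_unitary(thetas, generators, inputs.shape[0])
    overlaps = np.sum(targets.conj() * (u @ inputs), axis=0)
    return -float(np.mean(np.abs(overlaps) ** 2))


def gradient(thetas, generators, inputs, targets, eps=1e-5):
    """Central finite-difference gradient of the loss."""
    grad = np.zeros_like(thetas)
    for k in range(len(thetas)):
        shift = np.zeros_like(thetas)
        shift[k] = eps
        plus = loss(thetas + shift, generators, inputs, targets)
        minus = loss(thetas - shift, generators, inputs, targets)
        grad[k] = (plus - minus) / (2 * eps)
    return grad


def train(thetas, generators, inputs, targets, lr=0.1, epochs=200,
          target_loss=0.01, window=5, min_decay=0.05):
    """Gradient descent with the two stopping rules from the Figure 2 caption."""
    history = []
    for _ in range(epochs):
        rescaled = 1.0 + loss(thetas, generators, inputs, targets)  # now in [0, 1]
        history.append(rescaled)
        if rescaled < target_loss:
            break
        # Assumed reading: stop if the loss fell by less than 5% over the last 5 epochs.
        if len(history) > window and history[-1] > (1 - min_decay) * history[-1 - window]:
            break
        thetas = thetas - lr * gradient(thetas, generators, inputs, targets)
    return thetas, history


rng = np.random.default_rng(0)
dim, n_params, n_train = 4, 10, 2
generators = [random_generator(dim, rng) for _ in range(n_params)]
basis, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
inputs = basis[:, :n_train]                        # toy "training states"
targets = np.eye(dim, dtype=complex)[:, :n_train]  # computational-basis labels
thetas0 = rng.uniform(-np.pi, np.pi, size=n_params)
thetas, history = train(thetas0, generators, inputs, targets)
print(f"epochs run: {len(history)}, final rescaled loss: {history[-1]:.4f}")
```

Repeating `train` over many random initializations and collecting `history[-1]` would give a toy analogue of the final-loss histogram described for Figure 3, though nothing about this toy ansatz guarantees the same qualitative behavior.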

Theorems & Definitions (29)

  • Theorem 1
  • proof
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • ...and 19 more