Theory for Equivariant Quantum Neural Networks

Quynh T. Nguyen, Louis Schatzki, Paolo Braccia, Michael Ragone, Patrick J. Coles, Frederic Sauvage, Martin Larocca, M. Cerezo

TL;DR

This work presents a comprehensive framework for Equivariant Quantum Neural Networks (EQNNs), integrating symmetry principles from geometric deep learning into quantum models. It reframes EQNN layers as generalized Fourier-space actions, enabling precise parameter counting and flexible design via intermediate representations. Three construction methods—nullspace, twirling, and Choi-operator—produce unitary and non-unitary equivariant layers that remain scalable even for large or continuous symmetry groups, and are demonstrated in SU(2)-equivariant QCNNs. Numerical results on a bond-alternating Heisenberg model show SU(2)-equivariant QCNNs can outperform symmetry-agnostic counterparts, illustrating the practical benefits of symmetry-informed quantum learning and providing a blueprint for future quantum machine learning with strong geometric priors.

Abstract

Quantum neural network architectures that have little-to-no inductive biases are known to face trainability and generalization issues. Inspired by a similar problem, recent breakthroughs in machine learning address this challenge by creating models that encode the symmetries of the learning task. This is realized through equivariant neural networks, whose action commutes with that of the symmetry. In this work, we import these ideas to the quantum realm by presenting a comprehensive theoretical framework to design equivariant quantum neural networks (EQNNs) for essentially any relevant symmetry group. We develop multiple methods to construct equivariant layers for EQNNs and analyze their advantages and drawbacks. Our methods can find unitary or general equivariant quantum channels efficiently even when the symmetry group is exponentially large or continuous. As a special implementation, we show how standard quantum convolutional neural networks (QCNNs) can be generalized to group-equivariant QCNNs where both the convolution and pooling layers are equivariant to the symmetry group. We then numerically demonstrate the effectiveness of an SU(2)-equivariant QCNN over a symmetry-agnostic QCNN on a classification task of phases of matter in the bond-alternating Heisenberg model. Our framework can be readily applied to virtually all areas of quantum machine learning. Lastly, we discuss how symmetry-informed models such as EQNNs offer hope of alleviating central challenges such as barren plateaus, poor local minima, and sample complexity.

Paper Structure (67 sections, 16 theorems, 110 equations, 14 figures, 2 tables, 1 algorithm)

Key Result

Proposition 1

A model consisting of a $(G,R^{\text{in}},R^{\text{out}})$-equivariant QNN and a $(G,R^{\text{out}})$-equivariant set of measurements is $G$-invariant.
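
As a small numerical illustration of Proposition 1, the sketch below uses the $\mathbb{Z}_2$ setting from the figure captions ($R^{\text{in}}=\{\openone,X\}$, $R^{\text{out}}=\{\openone,Z\}$). The specific layer (conjugation by a Hadamard gate), the observable $O=Z$, the test state, and all function names are assumptions chosen for illustration and are not taken from the paper. The identity layer is included only as a symmetry-agnostic contrast.

```python
import numpy as np

# Single-qubit Pauli matrices and the Hadamard gate.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = (X + Z) / np.sqrt(2)

# Representations of Z_2 = {e, sigma} on the input and output spaces.
R_in = [I2, X]
R_out = [I2, Z]


def conjugate(U, rho):
    """Return U rho U^dagger."""
    return U @ rho @ U.conj().T


def equivariant_layer(rho):
    """Illustrative (assumed) equivariant layer: H X H = Z, so H maps R_in to R_out."""
    return conjugate(H, rho)


def agnostic_layer(rho):
    """Identity layer, used as a non-equivariant contrast for these representations."""
    return rho


def prediction(layer, rho, observable):
    """Model output Tr[layer(rho) O]."""
    return np.trace(layer(rho) @ observable).real


# A fixed valid state with Bloch vector (0.3, 0.4, 0.5).
rho = (I2 + 0.3 * X + 0.4 * Y + 0.5 * Z) / 2
O = Z  # commutes with every element of R_out

# Equivariance: layer(R_in(g) rho R_in(g)^dagger) == R_out(g) layer(rho) R_out(g)^dagger.
equivariance_gap = max(
    np.linalg.norm(
        equivariant_layer(conjugate(g_in, rho))
        - conjugate(g_out, equivariant_layer(rho))
    )
    for g_in, g_out in zip(R_in, R_out)
)

# Predictions on the original and on the symmetry-transformed input.
invariant_preds = [prediction(equivariant_layer, conjugate(g, rho), O) for g in R_in]
agnostic_preds = [prediction(agnostic_layer, conjugate(g, rho), O) for g in R_in]

print("equivariance gap:", equivariance_gap)
print("equivariant model predictions:", invariant_preds)
print("agnostic model predictions:", agnostic_preds)
```

For this state the equivariant model returns the same value for both group elements, while the identity layer paired with the same observable flips sign under the symmetry, so its output is not invariant.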

Figures (14)

  • Figure 1: Schematic representation of our main results. a) In GQML we start by identifying the symmetry group (or groups) that leave the data labels invariant. For the example shown, the data can be visualized on a three-dimensional sphere, and the labels are invariant under the action of $\mathbb{SO}(3)$. b) Both in classical and quantum machine learning it has been shown that models with equivariant layers often have an improved performance over non-equivariant architectures. The key feature of equivariance is that applying a rotation to the input data and sending it through the layer is the same as first sending the data through the layer and then rotating the output. On the other hand, feeding either a raw or a rotated data instance into a non-equivariant layer usually leads to distorted outputs which are not related by a rotation. c) In this work we provide a toolbox of methods for creating equivariant quantum neural networks (EQNNs) that can be readily used to construct quantum architectures with strong geometric priors.
  • Figure 2: Equivariant quantum neural network. a) We consider a QML problem composed of a dataset (that can either be quantum mechanical in nature, or corresponding to classical data that have been encoded in quantum states) as well as a label symmetry group $G$. The first step is to define the input and output representation of $G$ at each layer, where these can be natural, faithful, non-faithful, etc. From here, we will provide different techniques which allow us to construct the EQNN layers and control, for instance, the locality of their gates. b) Dashed lines indicate the representations of the symmetry group $G$ at specific stages in the EQNN, which may change between layers. At first, the input state $\rho_\text{in}$ is acted upon by the representation $R^\text{in}$. The $l$th layer of the EQNN, $\mathcal{N}^l_{\boldsymbol{\theta}_l}$, must be $(G,R^{l},R^{l+1})$-equivariant. In sum, the full architecture, $\phi = \mathcal{N}^L_{\boldsymbol{\theta}_L} \circ \cdots \circ \mathcal{N}^1_{\boldsymbol{\theta}_1}$, is $(G,R^\text{in},R^\text{out})$-equivariant. The $(G,R^\text{out})$-equivariant measurement operator $O$ is in the commutant of the output representation $R^\text{out}$. Note that if we only want the EQNN to produce an output state equivariantly or invariantly (e.g. in generative models), we can omit the measurements.
  • Figure 3: Different types of equivariant layers in a general architecture of EQNNs. A standard layer maps data between spaces of the same dimension. An embedding (pooling) layer maps the data to a higher-dimensional (lower-dimensional) space. In a lifting layer, $\operatorname{ker}(R^{l-1}) > \operatorname{ker}(R^{l})$, while in a projection layer $\operatorname{ker}(R^{l-1}) < \operatorname{ker}(R^{l})$.
  • Figure 4: Example of the nullspace method. We demonstrate how to use the nullspace method to determine the space of 1-to-1 qubit ($G$,$R^{\text{in}}$,$R^{\text{out}}$)-equivariant quantum channels, with $G=\mathbb{Z}_2=\{e,\sigma \}$, $R^{\text{in}}=\{\openone,X\}$ and $R^{\text{out}}=\{\openone,Z\}$. a) The matrix representation of both the input and output adjoint representations of the symmetry group. b) A basis for the 8-dimensional solution space, as well as two possible equivariant channels: $\phi(\rho)=\Tr[\rho]\openone/2$ obtained from the solution in red, and $\phi(\rho)=(X\rho X+Z\rho Z)/2$ obtained by combining the two solutions in green.
  • Figure 5: Example of the twirling method. We demonstrate how to use the twirling method to determine the space of 1-to-1 qubit ($G$,$R^{\text{in}}$,$R^{\text{out}}$)-equivariant quantum channels, with $G=\mathbb{Z}_2=\{e,\sigma \}$, $R^{\text{in}}=\{\openone,X\}$ and $R^{\text{out}}=\{\openone,Z\}$. a) Explicit calculation using the twirling formula of Eq. \ref{eq:twirl-finite}. b) Ancilla-based scheme for in-circuit twirling. c) Classical-randomness scheme for in-circuit twirling. Both schemes in b) and c), detailed in Appendix \ref{app:advanced_twirling}, recover the twirling in a).
  • ...and 9 more figures
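
The $\mathbb{Z}_2$ example used in the captions of Figures 4 and 5 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: linear maps on one qubit are written as $4\times 4$ real matrices in the Pauli basis, the equivariance condition is vectorized and solved by a singular value decomposition (nullspace method), and a random map is averaged over the group (twirling method). Both steps here act on general linear maps and do not enforce complete positivity or trace preservation. All variable and function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]


def adjoint_rep(U):
    """4x4 real matrix of rho -> U rho U^dagger in the Pauli basis (I, X, Y, Z)."""
    return np.array(
        [[np.trace(P @ U @ Q @ U.conj().T).real / 2 for Q in PAULIS] for P in PAULIS]
    )


# Adjoint representations of Z_2 = {e, sigma} for R_in = {1, X} and R_out = {1, Z}.
A_in = [adjoint_rep(I2), adjoint_rep(X)]
A_out = [adjoint_rep(I2), adjoint_rep(Z)]

# --- Nullspace method -------------------------------------------------------
# Equivariance of a map M reads M A_in(g) = A_out(g) M for every g. With the
# column-major identity vec(A M B) = (B^T kron A) vec(M), this becomes a
# homogeneous linear system on vec(M).
eye4 = np.eye(4)
constraints = np.vstack(
    [np.kron(a_in.T, eye4) - np.kron(eye4, a_out) for a_in, a_out in zip(A_in, A_out)]
)
_, singular_values, vh = np.linalg.svd(constraints)
rank = int(np.sum(singular_values > 1e-10))
null_basis = vh[rank:].T  # orthonormal columns spanning the solution space
nullspace_dim = null_basis.shape[1]

# --- Twirling method --------------------------------------------------------
# Average A_out(g)^{-1} M A_in(g) over the group; A_out(g) is orthogonal here,
# so its inverse is its transpose.
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))  # arbitrary, generally non-equivariant linear map
M_twirled = sum(a_out.T @ M @ a_in for a_in, a_out in zip(A_in, A_out)) / len(A_in)

twirl_gap = max(
    np.linalg.norm(M_twirled @ a_in - a_out @ M_twirled)
    for a_in, a_out in zip(A_in, A_out)
)

# The twirled map should lie inside the space found by the nullspace method.
v = M_twirled.reshape(-1, order="F")
residual = np.linalg.norm(v - null_basis @ (null_basis.T @ v))

# Check the channel (X rho X + Z rho Z) / 2 quoted in the Figure 4 caption.
M_caption = (adjoint_rep(X) + adjoint_rep(Z)) / 2
caption_gap = max(
    np.linalg.norm(M_caption @ a_in - a_out @ M_caption)
    for a_in, a_out in zip(A_in, A_out)
)

print("nullspace dimension:", nullspace_dim)
print("twirl equivariance gap:", twirl_gap)
print("distance of twirled map from nullspace:", residual)
print("caption channel equivariance gap:", caption_gap)
```

The nullspace step needs only one constraint per group element supplied, whereas the twirl as written sums over the whole group, which is only practical for small finite groups such as this one.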

Theorems & Definitions (35)

  • Definition 1: Label symmetries and $G$-invariance
  • Definition 2: Representation
  • Definition 3: Commutant
  • Definition 4
  • Definition 5: Inner and outer symmetries
  • Definition 6: Equivariant map
  • Definition 7: Equivariant operator
  • Proposition 1: Invariance from equivariance
  • Proof
  • Theorem 1: Structure of commutant (Theorem IX.11.2 in Ref. [simon1996representations])
  • ...and 25 more