Towards structure-preserving quantum encodings

Arthur J. Parzygnat, Tai-Danae Bradley, Andrew Vlasic, Anh Pham

TL;DR

The paper advocates using category theory to organize and design structure-preserving quantum encodings that map classical data $\mathcal{X}$ into quantum states $\mathcal{S}(\mathcal{H})$, via encodings $\rho: \mathcal{X}\to\mathcal{S}(\mathcal{H})$ or unitary encodings $U: \mathcal{X}\to\mathcal{U}(\mathcal{H})$. By pairing data with mathematical structures (e.g., symmetry, topology, metric) and viewing encodings as morphisms in categories, the authors show how forgetful functors lift problems to structure-rich contexts, thereby reducing the search space for good encodings. They illustrate this framework through concrete structure types: symmetry via $G$-equivariant encodings, continuity and smoothness via topologies, and distance notions via topological data analysis and metric learning, including exact distance-preserving results and semi-metric considerations when injectivity fails. The approach provides a rigorous, unified language to pose design questions, benchmarks, and open problems, with potential to guide principled encoding choices and clarify when quantum advantages survive through the encoding layer.
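As a hedged illustration of the symmetry case: writing $\alpha_g:\mathcal{X}\to\mathcal{X}$ for the action of a group element $g\in G$ on the data and $U_g$ for a unitary representation of $G$ on $\mathcal{H}$ (both symbols are assumptions here, chosen to match the usual convention in geometric quantum machine learning rather than quoted from the paper), a state encoding $\rho$ is $G$-equivariant when

$$\rho\big(\alpha_g(x)\big) \;=\; U_g\,\rho(x)\,U_g^{\dagger}\qquad\text{for all } g\in G,\ x\in\mathcal{X}.$$

Under this reading, transforming the data and then encoding gives the same state as encoding first and then conjugating by $U_g$. This is the sense in which requiring structure preservation restricts the set of admissible encodings.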

Abstract

Harnessing the potential computational advantage of quantum computers for machine learning tasks relies on the uploading of classical data onto quantum computers through what are commonly referred to as quantum encodings. The choice of such encodings may vary substantially from one task to another, and there exist only a few cases where structure has provided insight into their design and implementation, such as symmetry in geometric quantum learning. Here, we propose the perspective that category theory offers a natural mathematical framework for analyzing encodings that respect structure inherent in datasets and learning tasks. We illustrate this with pedagogical examples, which include geometric quantum machine learning, quantum metric learning, topological data analysis, and more. Moreover, our perspective provides a language in which to ask meaningful and mathematically precise questions for the design of quantum encodings and circuits for quantum machine learning tasks.

Paper Structure

This paper contains 12 sections, 3 theorems, 33 equations, 9 figures.

Key Result

Theorem 35

Let $(\mathcal{X},d_\mathcal{X})$ and $(\mathcal{Y},d_\mathcal{Y})$ be two metric spaces, let $X\subseteq\mathcal{X}$ be a finite subset equipped with the induced metric $d_{X}$ from $d_{\mathcal{X}}$. If $f:(X,d_{X})\to(\mathcal{Y},d_{\mathcal{Y}})$ is an embedding, let $Y:=f(X)$ be the image of $X

Figures (9)

  • Figure 1: The three essential components of a (variational) quantum machine-learning task or algorithm: the encoding block that transfers classical data $x$ onto the quantum computer, the variational block whose parameters $\theta$ can be modified so as to optimize some outcome, and the measurement. Our focus here is on the encoding step.
  • Figure 2: Among all possible set-theoretic functions describing quantum state encodings (dashed arrows) from a data domain $\mathcal{X}$ to a Hilbert space $\mathcal{H}$, demanding that a structure is preserved isolates a subset of quantum encodings (solid arrows), thus potentially simplifying the search for quantum encodings compatible with a given structure.
  • Figure 3: This is a variation of Figure 2 in Ref. meyer2023exploiting based on a binary classification task with discrete symmetries given by $\alpha_{(1,0)}$, a reflection along the line $x_{2}=x_{1}$; $\alpha_{(0,1)}$, an inversion; and $\alpha_{(1,1)}$, a reflection along the line $x_{2}=-x_{1}$. The structure is periodic so that the symmetry is preserved. More details about how to explicitly construct the decision boundaries for this classifier, as well as the associated observable, are provided in Appendix \ref{app:GQML}.
  • Figure 4: The standard form of amplitude encoding is smooth but not dimension-preserving. This is because it requires a step that pushes the set of data points in ${{\mathbb R}}^{2^{d}}$ onto the unit sphere $S^{2^{d}-1}$ thereby reducing the separation between the data points and hence increasing the difficulty in distinguishing between them. For example, although the square $\blacksquare$ is close to the diamond $\blacklozenge$ in the ambient data domain, they are far apart after amplitude encoding. Meanwhile, although $\blacktriangledown$ and $\blacktriangle$ are far apart in the ambient data domain, they are close together after amplitude encoding.
  • Figure 5: Although data $x_0$ in the interval $[-1,1]$ gets mapped into the northern hemisphere of $S^{1}$ within polar angle $\frac{\pi}{4}$, all other data gets mapped to the region with polar angle between $\frac{\pi}{4}$ and $\frac{\pi}{2}$. This is illustrated by showing how the set of negative integers becomes a sequence with a limit point at $(-1,0)$ and similarly the image of the positive integers has a limit point at $(1,0)$. This is shown here for $[-6,6]\cap{{\mathbb Z}}$.
  • ...and 4 more figures
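
The distance distortions described in the captions of Figures 4 and 5 can be reproduced numerically. The sketch below is a minimal illustration, not the paper's construction: the sample points, the helper names (`amplitude_encode`, `arctan_circle_encode`, `euclidean`), and the choice of an arctangent map for Figure 5 are all assumptions made here. The arctangent map was chosen only because it is consistent with the caption (the interval $[-1,1]$ lands within polar angle $\frac{\pi}{4}$, and large integers accumulate near $(\pm 1,0)$).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def amplitude_encode(x):
    """Normalization step of amplitude encoding: push x onto the unit sphere."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        raise ValueError("the zero vector cannot be amplitude-encoded")
    return [v / norm for v in x]


def arctan_circle_encode(x):
    """Hypothetical map R -> S^1 (an assumption): polar angle theta = arctan(x),
    measured from the north pole (0, 1)."""
    theta = math.atan(x)
    return (math.sin(theta), math.cos(theta))


# Illustrative points in R^2 (i.e. R^{2^d} with d = 1); the values are made up.
square, diamond = [0.01, 0.0], [0.0, 0.01]      # close together near the origin
tri_down, tri_up = [1.0, 0.1], [100.0, 10.0]    # far apart on the same ray

for name, (a, b) in {"square/diamond": (square, diamond),
                     "triangles": (tri_down, tri_up)}.items():
    before = euclidean(a, b)
    after = euclidean(amplitude_encode(a), amplitude_encode(b))
    print(f"{name}: distance before = {before:.4f}, after = {after:.4f}")

# Integers in [-6, 6] under the assumed arctangent map: images bunch up
# near (-1, 0) and (1, 0) as |x| grows.
for n in range(-6, 7):
    px, py = arctan_circle_encode(n)
    print(f"{n:+d} -> ({px:+.3f}, {py:+.3f})")
```

In this sketch, points that are close near the origin end up far apart on the sphere, while points on a common ray collapse to the same state. This is the failure of distance preservation that motivates the metric-focused results listed below.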

Theorems & Definitions (25)

  • Definition 1
  • Definition 3
  • Definition 5
  • Definition 7
  • Example 9
  • Example 26: Angle encoding
  • Example 29: Amplitude encoding
  • Definition 33
  • Definition 34
  • Theorem 35
  • ...and 15 more