Information-theoretic generalization bounds for learning from quantum data

Matthias Caro; Tom Gur; Cambyse Rouzé; Daniel Stilck França; Sathyawageeswar Subramanian

TL;DR

This work develops a unified information-theoretic framework for learning from data that is partly classical and partly quantum. By modeling quantum learners as channels acting on classical-quantum data and introducing loss observables, the authors derive generalization bounds that decompose into classical and quantum information contributions, specifically involving mutual information, Holevo information, and their Fenchel-Legendre transforms. Under sub-Gaussian moment generating function assumptions, the bounds yield explicit rates and recover classical results as special cases, while also applying to diverse quantum tasks such as PAC learning of quantum states, entangled-data scenarios, and quantum state discrimination. The framework thus provides a principled, unifying lens for quantum learning and offers pathways to analyze privacy, stability, and inductive learning in quantum settings with potential connections to quantum optimal transport and concentration phenomena.

Abstract

Learning tasks play an increasingly prominent role in quantum information and computation. They range from fundamental problems such as state discrimination and metrology, through the framework of quantum probably approximately correct (PAC) learning, to the recently proposed shadow variants of state tomography. However, the many directions of quantum learning theory have so far evolved separately. We propose a general mathematical formalism for describing quantum learning by training on classical-quantum data and then testing how well the learned hypothesis generalizes to new data. In this framework, we prove bounds on the expected generalization error of a quantum learner in terms of classical and quantum information-theoretic quantities measuring how strongly the learner's hypothesis depends on the specific data seen during training. To achieve this, we use tools from quantum optimal transport and quantum concentration inequalities to establish non-commutative versions of decoupling lemmas that underlie recent information-theoretic generalization bounds for classical machine learning. Our framework encompasses and gives intuitively accessible generalization bounds for a variety of quantum learning scenarios such as quantum state discrimination, PAC learning quantum states, quantum parameter estimation, and quantumly PAC learning classical functions. Thereby, our work lays a foundation for a unifying quantum information-theoretic perspective on quantum learning.

Paper Structure (35 sections, 10 theorems, 160 equations, 4 figures, 2 tables)

Introduction
Main results
Unified information-theoretic framework
Learners as maps.
Risk for classical learners.
Risk for quantum learners.
Generalization error bounds
Assumptions.
Generalization bounds.
Applications
PAC learning quantum states.
Quantum PAC learning from entangled data.
Quantum state discrimination and classification.
Discussion and outlook
Average-case vs. worst-case.
...and 20 more sections

Key Result

Theorem 1

If the classical-quantum data state $\rho$ and the loss observable satisfy the quantum and classical moment generating function assumptions (labelled qmgf and cmgf in the paper), then the expected generalization error of $\mathcal{A}$ satisfies a bound expressed through $\psi_\mp^{\ast -1}$ and $\phi_\mp^{\ast -1}$, which denote the inverses of the Legendre transforms of $\psi_\mp$ and $\phi_{\mp}$.
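
As a worked illustration of how an inverse Legendre transform turns into an explicit rate, consider the sub-Gaussian case mentioned in the TL;DR. This is a sketch under stated assumptions: the variance proxy $\sigma^2$ and the arguments $\lambda$, $x$, $y$ are hypothetical symbols introduced here, and the generic function $\psi$ stands in for any of $\psi_\mp$, $\phi_\mp$ without reproducing the paper's formal statement.

```latex
\begin{align}
  % Assumed sub-Gaussian bound on the log moment generating function
  \psi(\lambda) &= \frac{\sigma^2 \lambda^2}{2}, & \lambda &\ge 0, \\
  % Legendre transform: the supremum is attained at \lambda = x / \sigma^2
  \psi^{\ast}(x) &= \sup_{\lambda \ge 0} \bigl( \lambda x - \psi(\lambda) \bigr)
                  = \frac{x^2}{2 \sigma^2}, & x &\ge 0, \\
  % Inverse of the Legendre transform on the non-negative reals
  \psi^{\ast -1}(y) &= \sqrt{2 \sigma^2 y}, & y &\ge 0.
\end{align}
```

Under this assumption, an information quantity inserted into $\psi^{\ast -1}$ yields a square-root rate. This matches the shape of the classical bound of order $\sqrt{2 \sigma^2 I(S;W) / n}$ for $n$ training samples, which the summary says is recovered as a special case.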

Figures (4)

Figure 1: Framework for learning from classical-quantum data: The quantum learner $\mathcal{A}$ acts on the classical data and on the training subsystem of the quantum data via a measurement followed by classical and quantum post-processing. The performance of the resulting classical and quantum hypotheses is then evaluated via a loss measurement that also takes the testing subsystem of the quantum data into account. The training and testing subsystems may initially be correlated or even entangled.
Figure 2: Classical framework of Xu and Raginsky [Xu2017]: The expected empirical and true risk of a classical learner differ only in whether the training data and hypothesis are correlated or not. Decoupling the two leads to a generalization bound in terms of the mutual information (MI) $I(S;W)$.
Figure 3: Extension of the [Xu2017] framework to classical learners with test data: When taking test data into account, the expected empirical and true risk differ in whether training data and test data are correlated or decoupled and whether training data and hypothesis are correlated or decoupled. Thus, the expected generalization error can be bounded in terms of the MI quantities $I(S_{\mathrm{tr}};S_{\mathrm{te}})$ and $I(S_{\mathrm{tr}}; W)$. Note that the resulting expected risks in three out of the four cells coincide.
Figure 4: Framework for quantum learners: In our formalization of learning from CQ data, going from expected empirical to true risk requires decoupling the quantum training and test data as well as the classical hypothesis and classical training data. This leads to generalization bounds involving an average QMI plus Holevo information term and a classical MI term.
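
To make the Holevo information term in the Figure 4 caption concrete, the following minimal sketch evaluates it numerically for small ensembles. It assumes the standard textbook definition $\chi = S(\sum_x p_x \rho_x) - \sum_x p_x S(\rho_x)$, which may differ in notation from the paper's Definition 3.4. The function names `von_neumann_entropy` and `holevo_information` are hypothetical helpers, not part of the paper or of any library.

```python
import numpy as np


def von_neumann_entropy(rho):
    """Von Neumann entropy S(rho) = -Tr[rho log rho], in nats."""
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > 1e-12]  # drop zeros, using 0 * log 0 = 0
    return float(-np.sum(eigvals * np.log(eigvals)))


def holevo_information(probs, states):
    """Holevo information of the ensemble {(p_x, rho_x)} (textbook definition)."""
    average_state = sum(p * rho for p, rho in zip(probs, states))
    average_entropy = sum(p * von_neumann_entropy(rho) for p, rho in zip(probs, states))
    return von_neumann_entropy(average_state) - average_entropy


# Pure single-qubit states |0>, |1> and |+> as density matrices.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
ket_plus = (ket0 + ket1) / np.sqrt(2.0)
rho0, rho1, rho_plus = (np.outer(k, k) for k in (ket0, ket1, ket_plus))

chi_orthogonal = holevo_information([0.5, 0.5], [rho0, rho1])     # fully distinguishable
chi_identical = holevo_information([0.5, 0.5], [rho0, rho0])      # label carries no information
chi_overlapping = holevo_information([0.5, 0.5], [rho0, rho_plus])  # partially distinguishable
```

In this toy setting the orthogonal ensemble attains $\log 2$ nats, the ensemble of identical states gives zero, and the overlapping ensemble falls strictly in between. This reflects the idea that the quantity measures how strongly the quantum states depend on the classical label.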

Theorems & Definitions (49)

Theorem 1: Classical and quantum information-theoretic generalization bound. Informally stated; see \ref{theorem:qmi-gen-bound-qmgf-and-cmgf}
Corollary 2: Informally stated; see \ref{corollary:qmi-gen-bound-qmgf-and-cmgf-subgaussian}
Corollary 3: Informally stated; see \ref{corollary:qmi-gen-bound-qmgf-and-cmgf-subgaussian-independent}
Definition 3.1: Classical-Quantum (CQ) States
Definition 3.2: Quantum relative entropy
Definition 3.3: Quantum mutual information
Definition 3.4: Holevo information
Definition 3.5: POVMs and post-measurement states
Definition 3.6: Quantum channels -- Schrödinger picture
Definition 3.7: Quantum Wasserstein-$1$ distance [depalma2021quantumwasserstein]
...and 39 more
