Generalization Bounds in Hybrid Quantum-Classical Machine Learning Models
Tongyan Wu, Amine Bentellis, Alona Sakhnenko, Jeanette Miriam Lorenz
TL;DR
This work studies how generalization scales in hybrid quantum-classical machine learning models (QMLMs). It develops a unified framework based on covering numbers and Dudley’s entropy integral to bound the generalization error of hybrid QMLMs, and derives a bound that separates quantum and classical contributions. The main result is a bound of the form $\tilde{O}\left(\frac{\alpha^{k}}{\sqrt{N}}\left(k^{3/2}\sqrt{m n}+\sqrt{T\log T}\right)\right)$, showing how the number of trainable quantum gates $T$, the classical network depth $k$, and the Frobenius-norm bound $\alpha$ on the classical layers $F_i \in \mathbb{R}^{m \times n}$ interact with the training set size $N$ and the quantum circuit output dimension $n$. The bound offers theoretical guidance for designing hybrid architectures that balance quantum circuit depth and classical expressivity while preserving strong generalization. The work also notes limitations of current bounds and the need for empirical validation to guide the allocation of quantum and classical resources.
Abstract
Hybrid classical-quantum models aim to harness the strengths of both quantum computing and classical machine learning, but their practical potential remains poorly understood. In this work, we develop a unified mathematical framework for analyzing generalization in hybrid models, offering insight into how these systems learn from data. We establish a novel generalization bound of the form $\tilde{\mathcal{O}}\left( \frac{\alpha^{k}}{\sqrt{N}} \left( k^{3/2}\sqrt{m n} + \sqrt{T\log T} \right) \right)$ for $N$ training data points, $T$ trainable quantum gates, an $n$-dimensional quantum circuit output, and $k$ bounded linear layers $F_i \in \mathbb{R}^{m \times n}$ with $\|F_i\|_F \leq \alpha$ for $i = 1, \dots, k$, interspersed with activation functions. This generalization bound decomposes into quantum and classical contributions, providing a theoretical framework to separate their influence and clarify their interaction. Alongside the bound, we highlight conceptual limitations of applying classical statistical learning theory in the hybrid setting and suggest promising directions for future theoretical work.
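To make the decomposition concrete, the following minimal Python sketch evaluates the two terms of the stated bound separately. It is an illustration only: it drops the unknown constants and the logarithmic factors hidden by $\tilde{\mathcal{O}}$, so the returned values indicate scaling, not actual error guarantees. The function name `hybrid_bound_scaling` and its argument names are hypothetical and simply mirror the symbols $N$, $T$, $k$, $m$, $n$, $\alpha$ used above.

```python
import math


def hybrid_bound_scaling(N, T, k, m, n, alpha):
    """Evaluate alpha^k / sqrt(N) * (k^(3/2) * sqrt(m*n) + sqrt(T * log T)).

    Constants and log factors hidden by the tilde-O notation are ignored
    (an assumption of this sketch), so only relative scaling is meaningful.
    Returns (classical_term, quantum_term, total).
    """
    if N <= 0 or T < 1 or k < 1 or m < 1 or n < 1 or alpha <= 0:
        raise ValueError("expected N > 0, T >= 1, k >= 1, m >= 1, n >= 1, alpha > 0")

    # Shared prefactor: Frobenius-norm bound raised to the depth, over sqrt(N).
    prefactor = alpha ** k / math.sqrt(N)

    # Classical contribution: depth k and layer shape m x n.
    classical_term = prefactor * k ** 1.5 * math.sqrt(m * n)

    # Quantum contribution: number of trainable gates T.
    quantum_term = prefactor * math.sqrt(T * math.log(T))

    return classical_term, quantum_term, classical_term + quantum_term


if __name__ == "__main__":
    # Hypothetical settings chosen only to show the scaling behaviour.
    for N in (1_000, 4_000, 16_000):
        c, q, total = hybrid_bound_scaling(N=N, T=50, k=3, m=8, n=4, alpha=1.2)
        print(f"N={N:>6}: classical={c:.4f} quantum={q:.4f} total={total:.4f}")
```

Because both terms share the prefactor $\alpha^{k}/\sqrt{N}$, quadrupling $N$ halves the whole expression, while for $\alpha > 1$ the prefactor grows exponentially in the classical depth $k$.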
