Training robust and generalizable quantum models

Julian Berberich; Daniel Fink; Daniel Pranjić; Christian Tutschku; Christian Holm

TL;DR

Robustness against data perturbations and generalization in quantum machine learning are addressed by deriving parameter-dependent Lipschitz bounds for quantum models with trainable encodings and regularizing the encoding parameters during training. The authors establish a non-uniform generalization bound that explicitly involves the data-encoding parameters and show that smaller encoding norms and observable magnitudes improve generalization, with a sweet spot for the regularization strength that balances expressivity and robustness. Empirical results on a circle classification task demonstrate that trainable encodings with Lipschitz-bound regularization outperform fixed-encoding baselines in both robustness and generalization. The work provides a practical framework for improving robustness and generalization in quantum classifiers and underscores the value of trainable data encodings for NISQ-era quantum learning.

Abstract

Adversarial robustness and generalization are both crucial properties of reliable machine learning models. In this paper, we study these properties in the context of quantum machine learning based on Lipschitz bounds. We derive parameter-dependent Lipschitz bounds for quantum models with trainable encoding, showing that the norm of the data encoding has a crucial impact on the robustness against data perturbations. Further, we derive a bound on the generalization error which explicitly involves the parameters of the data encoding. Our theoretical findings give rise to a practical strategy for training robust and generalizable quantum models by regularizing the Lipschitz bound in the cost. Further, we show that, for fixed and non-trainable encodings, as those frequently employed in quantum machine learning, the Lipschitz bound cannot be influenced by tuning the parameters. Thus, trainable encodings are crucial for systematically adapting robustness and generalization during training. The practical implications of our theoretical findings are illustrated with numerical results.
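The training strategy described in the abstract, penalizing the Lipschitz bound in the cost, can be sketched on a toy model. The example below is an illustration under stated assumptions, not the authors' implementation: it assumes a single-qubit model with one rotation $R_X(w^\top x+\theta)$ applied to $|0\rangle$ and a Pauli-$Z$ measurement, for which the output is $\cos(w^\top x+\theta)$ and $\|w\|$ is a valid Lipschitz bound. The names `model`, `lipschitz_bound`, and `regularized_cost`, the squared-error loss, and the squared-norm penalty are hypothetical choices made for this sketch.

```python
import numpy as np

def model(X, w, theta):
    """Toy single-qubit model (assumption): <Z> after R_X(w^T x + theta)|0>.

    For this circuit the expectation value is cos(w^T x + theta).
    X has shape (n, d), w has shape (d,), theta is a scalar.
    """
    return np.cos(X @ w + theta)

def lipschitz_bound(w):
    """Lipschitz bound of the toy model w.r.t. the input x.

    The gradient is -sin(w^T x + theta) * w, so its norm is at most ||w||.
    The bound depends only on the trainable encoding weights w.
    """
    return np.linalg.norm(w)

def regularized_cost(X, y, w, theta, lam):
    """Empirical loss plus a penalty on the encoding norm.

    lam is the regularization strength; lam = 0 recovers the plain loss.
    The squared norm is used here for smoothness (an assumption).
    """
    empirical_loss = np.mean((model(X, w, theta) - y) ** 2)
    return empirical_loss + lam * np.sum(w ** 2)
```

In this toy setting, fixing `w` (a non-trainable encoding) leaves `lipschitz_bound` constant no matter how `theta` is tuned, which mirrors the abstract's argument for trainable encodings.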

Paper Structure (13 sections, 41 equations, 7 figures)

Introduction
Quantum models and their Lipschitz bounds
Robustness of quantum models
Generalization of quantum models
Benefits of trainable encodings
Conclusion
Lipschitz bounds of quantum models
Simple Lipschitz bound based on concatenation
Proof that \ref{eq:thm_lipschitz} is a Lipschitz bound
Full version and proof of Theorem \ref{thm:generalization_bound}
Numerics: setup and further results
Numerical setup
Regularization in quantum models with fixed encoding

Figures (7)

Figure 1: Schematic illustration of the quantum model and training setup considered in this work for an exemplary Fashion MNIST data set \cite{xiao2017fashionmnist}. The data, $x$, enter the quantum circuit via a trainable encoding, i.e., they are encoded into unitary operators $U_{j,\Theta_j}(x)$ via an affine function $w_j^\top x+\theta_j$ with trainable parameters $w_j$, $\theta_j$. During training, we minimize a cost function consisting of the empirical loss as well as an additional regularization term penalizing the norms of the parameters $w_j$. This regularization reduces the Lipschitz bound of the quantum model w.r.t. data perturbations and, thereby, encourages improved robustness and generalization properties.
Figure 2: We compare the robustness of quantum models trained via \ref{eq:training_opt_loss_regularized} for $\lambda\in\{0,0.2,0.5\}$ and a quantum model with fixed encoding \ref{eq:quantum_model_fixed_encoding}. As training and test sets, we draw $n=200$ and $1000$ points $x_i \in \mathcal{X}$, respectively, uniformly at random. To study robustness, we perturb each of the $1000$ test data points by random noise drawn uniformly from $[-\bar{\varepsilon},+\bar{\varepsilon}]^d \ (d=2)$. The test accuracy in the plot is the worst case over $200$ noise samples per data point.
Figure 3: Results for the generalization simulations. The training setup is identical to the robustness simulations described in Figure \ref{exp:robustness}. As the test set, we draw $10{,}000$ points uniformly at random and evaluate the trained models for different values of the regularization parameter $\lambda$.
Figure 4: Circuit representation of the quantum model \ref{eq:quantum_model_fixed_encoding} with fixed encoding.
Figure 5: From left to right, top to bottom: Illustration of the ground truth of the circle classification problem, the decision boundary for the fixed encoding model with regularization parameter $\lambda_f = 0.0$, the trainable encoding model with $\lambda_t = 0.15$ and with $\lambda_t = 0.0$. For the plot, we took the models with the lowest cost over all runs and epochs. Furthermore, the small circles denote the $200$ training points.
...and 2 more figures
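
The robustness evaluation described in the caption of Figure 2 can be sketched as follows. This is one possible reading, stated as an assumption: a test point counts as correctly classified only if the prediction is correct for every one of its noise samples. The function `worst_case_accuracy` and the stand-in classifier `circle_classifier` (with a hypothetical radius of 0.8 on the domain $[-1,1]^2$) are illustrative names and choices, not taken from the paper.

```python
import numpy as np

def worst_case_accuracy(predict, X_test, y_test, eps, n_noise=200, seed=0):
    """Fraction of test points classified correctly under all noise samples.

    Each point is perturbed n_noise times by noise drawn uniformly from
    [-eps, +eps]^d; a point is robust only if every perturbed copy is
    still assigned its true label (assumed reading of "worst case").
    """
    rng = np.random.default_rng(seed)
    n, d = X_test.shape
    robust = np.ones(n, dtype=bool)
    for _ in range(n_noise):
        noise = rng.uniform(-eps, eps, size=(n, d))
        robust &= predict(X_test + noise) == y_test
    return robust.mean()

def circle_classifier(X, radius=0.8):
    """Hypothetical stand-in for a trained model: +1 inside the circle, -1 outside."""
    return np.where(np.linalg.norm(X, axis=1) < radius, 1, -1)

# Usage: 1000 test points drawn uniformly from an assumed domain [-1, 1]^2.
rng = np.random.default_rng(1)
X_test = rng.uniform(-1.0, 1.0, size=(1000, 2))
y_test = circle_classifier(X_test)
acc_clean = worst_case_accuracy(circle_classifier, X_test, y_test, eps=0.0)
acc_noisy = worst_case_accuracy(circle_classifier, X_test, y_test, eps=0.1)
```

Because the stand-in classifier also generates the labels, `acc_clean` is exactly 1 here; points close to the decision boundary lose their label under perturbation, so `acc_noisy` drops below it.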

Theorems & Definitions (2)

proof
proof
