
Kernel-based optimization of measurement operators for quantum reservoir computers

Markus Gross, Hans-Martin Rieser

TL;DR

This work formulates the training of both stateless and stateful QRCs in the framework of kernel ridge regression, which yields an optimal measurement operator that minimizes the prediction error for a given reservoir and training dataset.

Abstract

Finding optimal measurement operators is crucial for the performance of quantum reservoir computers (QRCs), since they employ a fixed quantum feature map. We formulate the training of both stateless (quantum extreme learning machines, QELMs) and stateful (memory dependent) QRCs in the framework of kernel ridge regression. This approach renders an optimal measurement operator that minimizes prediction error for a given reservoir and training dataset. For large qubit numbers, this method is more efficient than the conventional training of QRCs. We discuss efficiency and practical implementation strategies, including Pauli basis decomposition and operator diagonalization, to adapt the optimal observable to hardware constraints. Numerical experiments on image classification and time series prediction tasks demonstrate the effectiveness of this approach, which can also be applied to other quantum ML models.
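The paper's equations are not reproduced in this summary, so the following is only a minimal sketch of the generic kernel ridge regression construction the abstract refers to. It assumes the kernel is the Hilbert-Schmidt overlap $K_{ij} = \mathrm{Tr}(\rho_i \rho_j)$ of the reservoir output states and that the model output is $\mathrm{Tr}(M\rho)$, in which case the regularized least-squares solution can be written as $M^* = \sum_i \alpha_i \rho_i$ with $\alpha = (K + \lambda I)^{-1} y$. The function names, the regularization strength `reg`, and the random test states are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def random_density_matrix(dim, rng):
    """Hypothetical stand-in for a reservoir output state rho(x)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T              # positive semidefinite
    return rho / np.trace(rho).real   # unit trace


def fit_optimal_observable(states, targets, reg=1e-6):
    """Kernel ridge regression in the dual (kernel) picture.

    states  : array of shape (n_samples, dim, dim), density matrices rho_i
    targets : array of shape (n_samples,), real training targets y_i
    reg     : ridge parameter lambda (assumed value, tune per task)
    """
    states = np.asarray(states)
    # Gram matrix K_ij = Tr(rho_i rho_j); real for Hermitian rho
    K = np.einsum("iab,jba->ij", states, states).real
    # Dual coefficients alpha = (K + lambda I)^{-1} y
    alpha = np.linalg.solve(K + reg * np.eye(len(states)), targets)
    # Observable M = sum_i alpha_i rho_i (Hermitian since alpha is real)
    M = np.einsum("i,iab->ab", alpha, states)
    return M, alpha, K


def predict(M, rho):
    """Model output Tr(M rho) for a new reservoir state rho."""
    return np.trace(M @ rho).real


rng = np.random.default_rng(0)
train_states = np.stack([random_density_matrix(4, rng) for _ in range(6)])
train_targets = rng.normal(size=6)
M_opt, alpha, K = fit_optimal_observable(train_states, train_targets)
train_preds = np.array([predict(M_opt, rho) for rho in train_states])
```

In this picture the linear system has the size of the training set rather than of the operator basis (which grows as $4^N$ for $N$ qubits), which is one way to read the abstract's efficiency claim for large qubit numbers. For a multi-class task, one such observable would be fitted per output, e.g. one per digit class.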

Paper Structure (25 sections, 78 equations, 4 figures, 3 tables)


Figures (4)

  • Figure 1: Optimization of the measurement operator for a QELM trained on the MNIST (first $2N$ principal components) classification task (left) without and (right) with a reservoir unitary (transverse field Ising model). The dashed line represents the test accuracy obtained using the full operator $M^*$ (i.e., one per digit class), while the connected symbols correspond to the Pauli decomposition and diagonalization approaches [\ref{eq_pauli_exp}, \ref{eq_diag_spec}]. The notation ord($k$) and mon($p$) refers to the observable order $k$ and monomial order $p$ in \ref{eq_QELM_readout}. The curve labeled 'primal: Z,ZZ+mon(4)' represents primal optimization using only the Pauli-$Z$ and -$ZZ$ observables enhanced by readout monomials up to 4th order, while CSVM refers to a classical SVM trained on (non-)linear features constructed from the principal components. Since there is no intrinsic randomness in the model, error bars arise only from the selection of samples and are negligible here. $^\dag$Here, we optimized the kernel only over a single class (to limit combinatorial explosion in the number of readout terms), but used the resulting operator set to re-train the QELM for all classes.
  • Figure 2: Optimization of the measurement layer for a memoryless QRC without ancilla qubits (QELM), for the cases without and with a (TFIM-based) reservoir unitary. The QRC is trained on a 3-dimensional chaotic time series [\ref{eq_Lorenz63}]. The error bars on the forecast horizon represent the standard deviation (not error of the mean) after averaging over different initial conditions. (Their magnitude is comparable to classical RC results for chaotic systems.) The dashed line represents the forecast horizon when using the kernel-based optimal measurement operator $M^*$ directly, while the connected points correspond to decompositions into subsets of observables [\ref{eq_pauli_exp}, \ref{eq_diag_spec}]. The dash-dotted line gives the forecast horizon (with its standard deviation indicated by the gray area) for the primal optimization over the complete operator basis. $E_{\text{train}}$ denotes the (RMS) training error. The horizontal axis represents the number of observables ranked by their relevance (left = least relevant; maximum number $= 64\times 3=192$ for right panel).
  • Figure 3: Optimization of the measurement operator for a QRC with internal memory ($N_A=1$ ancilla qubit coupled via a TFIM unitary to $N_I=3$ input qubits), trained to predict the Lorenz-63 time series [\ref{eq_Lorenz63}] encoded via amplitude encoding [\ref{eq_enc_amp_sqrt}]. The dashed line represents the forecast horizon using the optimal operator $M^*$ directly, while the connected points correspond to decompositions into subsets of observables [\ref{eq_pauli_exp}, \ref{eq_diag_spec}]. The forecast horizon obtained from the primal optimization over the full operator basis is shown by the dash-dotted line, with the gray area representing the standard deviation.
  • Figure 4: Optimization of the measurement layer for a QRC with internal memory ($N_A=3$ ancilla qubits, coupled to $N_I=1$ input qubit via a TFIM-based unitary). In (a), the QRC is trained on a random harmonic signal [\ref{eq_harm_timeser}] with $n=9$ frequencies, encoded via amplitude encoding [\ref{eq_enc_amp_sq}]. In (b), the Mackey-Glass time series [\ref{eq_mcgl_DDE}] is encoded via [\ref{eq_enc_amp_sqrt}]. The dashed line represents the forecast horizon (relative to the maximum tested time in (a)) using the optimal measurement operator $M^*$ (as computed by the kernel-based optimization), while the connected points correspond to decompositions into subsets of observables [\ref{eq_pauli_exp}, \ref{eq_diag_spec}], ranked by their relevance along the horizontal axis. The error bars on the forecast horizon represent the standard deviation after averaging over different initial conditions. The forecast horizon obtained from the primal optimization over the full operator basis is shown by the dash-dotted line, with the gray area representing the standard deviation. $E_{\text{train}}$ denotes the (RMS) training error.
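The captions above compare the full operator $M^*$ against subsets of observables obtained from a Pauli decomposition and from a diagonalization. The referenced equations are not included in this summary, so the sketch below only illustrates the generic versions of both ideas: expanding a Hermitian operator as $M = \sum_P c_P P$ with $c_P = \mathrm{Tr}(P M)/2^N$, and writing $M = \sum_k \lambda_k |v_k\rangle\langle v_k|$. Ranking terms by $|c_P|$ or $|\lambda_k|$ is an assumption made here for illustration; the paper's relevance criterion may differ. All function names are hypothetical.

```python
import itertools
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def pauli_string(label):
    """Tensor product of single-qubit Paulis, e.g. 'ZZ' or 'XI'."""
    op = np.array([[1.0 + 0j]])
    for ch in label:
        op = np.kron(op, PAULIS[ch])
    return op


def pauli_coefficients(M, n_qubits):
    """Coefficients c_P = Tr(P M) / 2^N for every N-qubit Pauli string."""
    dim = 2 ** n_qubits
    coeffs = {}
    for chars in itertools.product("IXYZ", repeat=n_qubits):
        label = "".join(chars)
        coeffs[label] = np.trace(pauli_string(label) @ M).real / dim
    return coeffs


def truncate_pauli(M, n_qubits, n_terms):
    """Keep the n_terms Pauli strings with the largest |c_P| (assumed ranking)."""
    coeffs = pauli_coefficients(M, n_qubits)
    ranked = sorted(coeffs.items(), key=lambda kv: abs(kv[1]), reverse=True)
    kept = ranked[:n_terms]
    M_trunc = sum(c * pauli_string(label) for label, c in kept)
    return M_trunc, kept


def truncate_spectrum(M, n_terms):
    """Keep the n_terms eigenprojectors with the largest |lambda_k| (assumed ranking)."""
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(evals))[:n_terms]
    # Scale each kept eigenvector column by its eigenvalue, then project back
    return (evecs[:, order] * evals[order]) @ evecs[:, order].conj().T
```

Either truncated operator can be substituted for $M^*$ when evaluating $\mathrm{Tr}(M\rho)$, which gives accuracy-versus-number-of-observables curves of the kind the figures describe. The exhaustive loop over $4^N$ Pauli strings is only practical for small $N$.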