Breaking concentration barriers for quantum extreme learning on digital quantum processors

Timothée Dao; Ege Yilmaz; Ibrahim Shehzad; Christophe Pere; Kumar Ghosh; Isabelle Wittmann; Thomas Brunschwiler; Giorgio Cortiana; Corey O'Meara; Stefan Woerner; Francesco Tacchino

Abstract

Reservoir computing leverages rich, non-linear dynamics to process temporal data. Quantum variants promise enhanced expressivity from high-dimensional Hilbert spaces, yet their practical applicability is hindered by hardware noise and concentration effects that can erase input-output distinguishability at large system sizes. In this work, we present and experimentally demonstrate a Quantum Extreme Learning Machine (QELM) tailored to state-of-the-art superconducting platforms, employing up to 124 qubits and circuits with more than 5,000 two-qubit gates on IBM Quantum computers. We introduce a practical multi-objective hyperparameter tuning strategy that jointly monitors observable variability, capacity, and task performance to identify noise-robust operating points. In addition, we develop a local eigentask analysis that enables computationally efficient feature selection and effective information retrieval. We report evidence of a regime of optimality that is identifiable at small scales and transferable across tasks and larger systems, and we achieve performances competitive with leading classical baselines on representative benchmarks for time-series forecasting and satellite image classification. Together, our results establish a viable and robust framework for large-scale, pre-fault-tolerant quantum machine learning and provide a foundation for extending reservoir-based methods to more expressive architectures and real-world scenarios.
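As a rough illustration of the multi-objective tuning idea mentioned above, the following sketch filters a set of candidate hyperparameter configurations down to their Pareto front, i.e. the configurations that no other candidate beats on every objective at once. The function name `pareto_front`, the two objectives (a task score and an observable-variability score, both maximized), and the numbers are hypothetical placeholders, not values or code from the paper.

```python
import numpy as np

def pareto_front(points, maximize):
    """Return the indices of the non-dominated rows of `points`.

    points   : array of shape (n_configs, n_objectives)
    maximize : one boolean per objective (True = larger is better)
    """
    pts = np.asarray(points, dtype=float)
    # Flip the sign of minimized objectives so that every column is maximized.
    signs = np.where(np.asarray(maximize), 1.0, -1.0)
    pts = pts * signs
    keep = []
    for i, p in enumerate(pts):
        # q dominates p if q is at least as good everywhere and strictly better somewhere.
        dominated = np.any(np.all(pts >= p, axis=1) & np.any(pts > p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical (task score, observable variability) pairs, one row per configuration.
configs = np.array([
    [0.90, 0.10],
    [0.80, 0.30],
    [0.70, 0.20],  # dominated by [0.80, 0.30]
    [0.60, 0.50],
    [0.60, 0.40],  # dominated by [0.60, 0.50]
])
front = pareto_front(configs, maximize=[True, True])
print(front)
```

The quadratic-time loop is adequate for a few thousand sampled configurations; an objective that should be minimized can be handled by passing `False` in the corresponding position of `maximize`.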

Paper Structure (15 sections, 16 equations, 10 figures, 3 tables)

Model design
Universal operating regime
Applications and results
Conclusions
Methods
Device and execution details
Quantum processor
Execution summary
Measurement model and linear readout as POVM post‑processing
QELM extension for Multi-Step Energy Price Forecasting
Additional Landsat results
Learning Curves
Pauli and eigentask readouts at 124 qubits
Sensitivity to the NSR cutoff
Classical baselines

Figures (10)

Figure 1: Schematic representation of the QELM architecture. The input vector $\mathbf{u}$ is injected through a sequence of encoding layers, where each two‑qubit block applies a data‑dependent transverse‑field kick with strength $b = g(u[i])$. These layers alternate with dynamics layers, in which the blocks implement fixed, randomly initialized kicked‑Ising evolutions governed by parameters $(J, h, b)$ that remain constant after initialization. Qubits are arranged in a ring topology and initialized in Bell pairs. After the full sequence of encoding and dynamical evolutions, expectation values of a set of observables $\mathcal{R}$ are measured from the final quantum state, forming the reservoir feature vector $\mathbf{r}(\mathbf{u})$. A classical linear readout layer combines these features to produce the final model output.
Figure 2: Pareto‑front projections obtained from multi‑objective hyperparameter optimization across QELM architectures of different sizes (6, 8, 10, and 12 qubits). Each dot represents a sampled hyperparameter configuration, colored by system size, and the solid lines indicate the corresponding Pareto fronts. Yellow stars mark the hyperparameters selected from the initial joint optimization (Table \ref{tab:hyperparameters}). (a) Trade‑off between NARMA performance under realistic shot noise and bond dimension, illustrating how the chosen configuration consistently lies near the Pareto‑optimal region across model sizes. (b) Trade‑off between observable variability and bond dimension, showing that the selected operating point occupies a regime of high entanglement richness while preserving sufficient output variability to avoid over‑concentration. Together, these projections highlight the emergence of a universal operating regime that generalizes well across architectures.
Figure 3: Hardware validation of large-scale QELM architectures on the ibm_quebec processor. (a) Performance on the NARMA‑$n$ benchmark for 24‑ and 72‑qubit QELMs executed on hardware. Here $n$ is the task order (target depends on the last $n$ outputs and specific input delays). $R^2$ is reported versus $n$; vertical dotted lines indicate the rolling‑window size $n=L=N/2$. The 72‑qubit model maintains accuracy to larger $n$, consistent with its greater effective memory. (b) Classification performance on the Landsat dataset for the 124-qubit QELM, shown as a function of the number of measurement shots used per input sample. Training and test F1 macro scores are displayed together with $95\%$ bootstrap confidence intervals, demonstrating systematic improvement in generalization as shot precision increases. (c) Learning curves for the same 124-qubit model, reporting the F1 macro score versus the size of the training set. Test performance improves monotonically with additional training data, while the training score remains consistently high, indicating that the linear readout operates far from saturation. Shaded regions and error bars again denote $95\%$ bootstrap confidence intervals.
Figure 4: Classification performance (F1 macro) for the 124‑qubit Landsat experiment using different QELM readouts: single‑basis $X$, full Pauli, local eigentasks (ET), and ET with NSR‑based cutoff. Bars compare weight‑1 vs. weight‑2 features and unit-variance vs. signal‑aware scaling. Horizontal lines denote classical baselines trained directly on the raw features without tuning (see Appendix \ref{app:results/classical_baselines} for configurations). Each baseline score is the average over 100 independent initializations and training runs.
Figure 5: Qubit layout of the 124‑qubit ring used on ibm_quebec. The heavy‑hexagonal lattice is shown in grey; the selected 124‑qubit cycle is highlighted. Node color encodes $T_2$ times, and edge color shows the calibrated $CZ$ error rates of the device.
...and 5 more figures
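
The captions of Figure 3(b) and 3(c) report F1 macro scores with $95\%$ bootstrap confidence intervals. The sketch below shows one common way to obtain such an interval, a percentile bootstrap over test samples. The resampling scheme, the helper names `f1_macro` and `bootstrap_ci`, the zero-score convention for classes absent from a resample, and the toy labels are all assumptions made for illustration; the paper's exact procedure may differ.

```python
import numpy as np

def f1_macro(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in labels:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # Convention chosen here: a class with no true or predicted samples scores 0.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

def bootstrap_ci(y_true, y_pred, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the macro F1 score."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample test points with replacement, keeping (truth, prediction) pairs together.
        idx = rng.integers(0, n, size=n)
        stats[b] = f1_macro(y_true[idx], y_pred[idx], labels)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Toy labels and predictions (hypothetical, three classes).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
labels = [0, 1, 2]

point = f1_macro(y_true, y_pred, labels)
lo, hi = bootstrap_ci(y_true, y_pred, labels)
print(f"F1 macro = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

For the toy data the per-class F1 scores are 1, 2/3, and 4/5, so the point estimate is 37/45. With only six samples the interval is very wide; the width shrinks as the test set grows.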
