Towards Noise-Resilient Quantum Multi-Armed and Stochastic Linear Bandits

Zhuoyue Chen, Kechao Cai

Abstract

Multi-armed bandits (MAB) and stochastic linear bandits (SLB) have recently attracted significant attention in the quantum setting, because their quantum counterparts (QMAB and QSLB) can achieve quadratic speedups over the classical algorithms. However, most existing QMAB algorithms assume ideal quantum Monte Carlo (QMC) procedures on noise-free circuits, overlooking the noise present in current noisy intermediate-scale quantum (NISQ) devices. In this paper, we study a noise-robust QMC algorithm that improves estimation accuracy when querying quantum reward oracles. Building on this estimator, we propose noise-robust QMAB and QSLB algorithms that improve performance in noisy environments while preserving the advantage over classical methods. Experiments show that our noise-robust approach improves QMAB estimation accuracy and reduces regret under several quantum noise models.

Paper Structure

This paper contains 20 sections, 17 equations, 3 figures, 3 algorithms.

Figures (3)

  • Figure 1: Regret comparison results in noiseless settings: (a) and (b) compare QUCB and UCB under two reward gap settings, and (c) compares QLinUCB and LinUCB.
  • Figure 2: Cumulative regret of QUCB variants under four noise settings: exponential decoherence, readout, depolarizing, and amplitude damping.
  • Figure 3: Cumulative regret of QLinUCB variants under four noise settings: exponential decoherence, readout, depolarizing, and amplitude damping.