Quantum Non-Linear Bandit Optimization

Zakaria Shams Siam, Chaowen Guan, Chong Liu

TL;DR

This work introduces Q-NLB-UCB, a quantum algorithm for non-linear bandit optimization that uses parametric function approximation to achieve an input-dimension-free regret bound of $R_T = O\big(d_w^2 \log^{3/2}(T) \log(d_w \log T)\big)$. Central to the approach are a quantum regression oracle, quantum fast-forward, and quantum Monte Carlo mean estimation, which together yield accelerated estimation of the surrogate parameters and efficient, staged uncertainty management. The method generalizes beyond kernels by allowing flexible surrogate families (linear, quadratic, or neural networks) and demonstrates superior performance over quantum baselines on synthetic benchmarks and real AutoML tasks. The results suggest that quantum-enhanced, parametric surrogate-based bandits can solve high-dimensional, black-box optimization problems more efficiently than prior RKHS-based quantum methods, with potential impact on hyperparameter tuning and drug discovery. The paper provides both rigorous regret guarantees and empirical validation to support these claims.

Abstract

We study non-linear bandit optimization where the learner maximizes a black-box function with a zeroth-order function oracle, which has been successfully applied in many critical applications such as drug discovery and hyperparameter tuning. Existing works have shown that with the aid of quantum computing, it is possible to break the $\Omega(\sqrt{T})$ regret lower bound in classical settings and achieve the new $O(\mathrm{poly}\log T)$ upper bound. However, they usually assume that the objective function sits within the reproducing kernel Hilbert space and their algorithms suffer from the curse of dimensionality. In this paper, we propose the new Q-NLB-UCB algorithm which uses the novel parametric function approximation technique and enjoys performance improvement due to quantum fast-forward and quantum Monte Carlo mean estimation. We prove that the regret bound of Q-NLB-UCB is not only $O(\mathrm{poly}\log T)$ but also input dimension-free, making it applicable to high-dimensional tasks. At the heart of our analyses are a new quantum regression oracle and a careful construction of the parameter uncertainty region. Our algorithm is also validated for its efficiency on both synthetic and real-world tasks.
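To give a feel for the gap between the classical $\Omega(\sqrt{T})$ rate and the polylogarithmic bound quoted in the TL;DR, the following minimal sketch evaluates both growth rates with all hidden constants set to 1. The function names and the choice $d_w = 5$ are illustrative assumptions, not taken from the paper. Because the constants are dropped, the crossover point it prints is not meaningful; only the trend in $T$ is.

```python
import math


def classical_sqrt_rate(T: int) -> float:
    """Growth of the classical sqrt(T) regret rate, constants dropped."""
    return math.sqrt(T)


def q_nlb_ucb_rate(T: int, d_w: int) -> float:
    """Growth of d_w^2 * log^{3/2}(T) * log(d_w * log T), constants dropped.

    d_w is the dimension of the surrogate's parameter vector; the input
    dimension of the objective does not appear in the expression.
    """
    log_T = math.log(T)
    return d_w**2 * log_T**1.5 * math.log(d_w * log_T)


if __name__ == "__main__":
    d_w = 5  # hypothetical parameter dimension, chosen only for illustration
    for T in (10**2, 10**4, 10**8, 10**12):
        ratio = q_nlb_ucb_rate(T, d_w) / classical_sqrt_rate(T)
        print(f"T={T:>14d}  sqrt(T)={classical_sqrt_rate(T):12.1f}  "
              f"polylog={q_nlb_ucb_rate(T, d_w):12.1f}  ratio={ratio:.4f}")
```

The ratio of the two rates shrinks as $T$ grows, which is the sense in which a polylogarithmic bound eventually beats any $\sqrt{T}$ rate regardless of the constants involved.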

Paper Structure

This paper contains 28 sections, 17 theorems, 100 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Lemma 3.1

Given the access to a quantum sampling oracle $\mathcal{O}_Y$ (and its inverse $\mathcal{O}_Y^\dagger$) that encodes the distribution of a random variable $Y$, as defined in def:sampling_oracle.
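As background for why a quantum mean estimator helps, the sketch below compares the commonly cited query scalings for estimating a mean to additive error $\epsilon$ when the random variable has standard deviation at most $\sigma$: roughly $\sigma^2/\epsilon^2$ samples classically versus roughly $\sigma/\epsilon$ oracle queries for quantum Monte Carlo, up to constants and logarithmic factors. This is an assumption-laden illustration of the general scaling, not the exact statement of Lemma 3.1, and the function names are hypothetical.

```python
import math


def classical_samples(sigma: float, eps: float) -> int:
    """Order of samples for classical Monte Carlo: (sigma / eps)^2.

    Constants and the confidence-level factor are dropped (assumption).
    """
    return math.ceil((sigma / eps) ** 2)


def quantum_queries(sigma: float, eps: float) -> int:
    """Order of oracle queries for quantum mean estimation: sigma / eps.

    Constants and logarithmic factors are dropped (assumption).
    """
    return math.ceil(sigma / eps)


if __name__ == "__main__":
    sigma = 1.0
    for eps in (0.5, 0.25, 0.125, 0.0625):
        print(f"eps={eps:<7}  classical~{classical_samples(sigma, eps):>5d}  "
              f"quantum~{quantum_queries(sigma, eps):>3d}")
```

Halving $\epsilon$ quadruples the classical count but only doubles the quantum one, which is the quadratic saving that mean-estimation-based quantum bandit algorithms build on.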

Figures (3)

  • Figure 1: Cumulative regrets (the lower the better) of all compared quantum bandit algorithms on synthetic functions.
  • Figure 2: Cumulative regrets (the lower the better) of all compared quantum bandit algorithms in real-world SVM hyperparameter tuning tasks.
  • Figure 3: High-level description of a quantum algorithm computing $D^{T_0} \mathbf{w}$.

Theorems & Definitions (35)

  • Lemma 3.1: Quantum Monte Carlo mean estimator [Mon15]
  • Lemma 5.1: Adapted from Theorem 5.2 in [LW23]
  • Lemma 5.2: Informal statement of quantum fast-forward [AGJK20]
  • Theorem 5.3
  • Theorem 5.4: Cumulative regret bound of Q-NLB-UCB
  • Remark 5.5
  • Remark 5.6
  • Lemma 5.7: Number of stages in Q-NLB-UCB
  • Lemma 5.8: Confidence bound of Q-NLB-UCB
  • Remark 5.9
  • ...and 25 more