Logarithmic-Regret Quantum Learning Algorithms for Zero-Sum Games

Minbo Gao, Zhengfeng Ji, Tongyang Li, Qisheng Wang

TL;DR

The paper proposes the first online quantum algorithm for zero-sum games with $\widetilde O(1)$ regret under the game setting. The algorithm computes an $\varepsilon$-approximate Nash equilibrium of an $m \times n$ matrix zero-sum game in quantum time $\widetilde O(\sqrt{m+n}/\varepsilon^{2.5})$.

Abstract

We propose the first online quantum algorithm for solving zero-sum games with $\widetilde O(1)$ regret under the game setting. Moreover, our quantum algorithm computes an $\varepsilon$-approximate Nash equilibrium of an $m \times n$ matrix zero-sum game in quantum time $\widetilde O(\sqrt{m+n}/\varepsilon^{2.5})$. Our algorithm uses standard quantum inputs and generates classical outputs with succinct descriptions, facilitating end-to-end applications. Technically, our online quantum algorithm "quantizes" classical algorithms based on the optimistic multiplicative weight update method. At the heart of our algorithm is a fast quantum multi-sampling procedure for the Gibbs sampling problem, which may be of independent interest.
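To make the classical starting point concrete, the following is a minimal sketch of optimistic multiplicative weight update on a small matrix game, written in plain Python. It is not the quantum algorithm from the paper: both players' distributions are computed explicitly here, whereas the abstract indicates the quantum version samples from such Gibbs (softmax) distributions instead. The function names, the step size `eta`, the sign convention (row player minimizes $x^\top A y$), and the example matrix are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Gibbs distribution proportional to exp(score), computed stably."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def optimistic_mwu(A, rounds, eta=0.1):
    """Classical optimistic MWU sketch; returns the averaged strategies.

    Assumed convention: the row player minimizes x^T A y and the column
    player maximizes it. Each player counts the most recent loss/gain
    vector twice, which is the "optimistic" prediction step.
    """
    m, n = len(A), len(A[0])
    cum_loss, last_loss = [0.0] * m, [0.0] * m  # row player: sums of A y_t
    cum_gain, last_gain = [0.0] * n, [0.0] * n  # column player: sums of A^T x_t
    avg_x, avg_y = [0.0] * m, [0.0] * n
    for _ in range(rounds):
        x = softmax([-eta * (cum_loss[i] + last_loss[i]) for i in range(m)])
        y = softmax([eta * (cum_gain[j] + last_gain[j]) for j in range(n)])
        last_loss = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
        last_gain = [sum(A[i][j] * x[i] for i in range(m)) for j in range(n)]
        cum_loss = [c + l for c, l in zip(cum_loss, last_loss)]
        cum_gain = [c + g for c, g in zip(cum_gain, last_gain)]
        avg_x = [a + xi / rounds for a, xi in zip(avg_x, x)]
        avg_y = [a + yj / rounds for a, yj in zip(avg_y, y)]
    return avg_x, avg_y

def duality_gap(A, x, y):
    """max_j (A^T x)_j - min_i (A y)_i; zero exactly at a Nash equilibrium."""
    m, n = len(A), len(A[0])
    best_col = max(sum(A[i][j] * x[i] for i in range(m)) for j in range(n))
    best_row = min(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    return best_col - best_row

if __name__ == "__main__":
    # Hypothetical 2x2 game with entries bounded by 1 in absolute value.
    A = [[1.0, -1.0], [-1.0, 0.5]]
    x_bar, y_bar = optimistic_mwu(A, rounds=2000)
    print("averaged row strategy:", x_bar)
    print("averaged column strategy:", y_bar)
    print("duality gap:", duality_gap(A, x_bar, y_bar))
```

The duality gap of the averaged strategies is at most the sum of the two players' regrets divided by the number of rounds, so a regret bound that grows only logarithmically translates directly into an $\varepsilon$-approximate equilibrium after roughly $1/\varepsilon$ rounds, up to logarithmic factors.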

Paper Structure

This paper contains 31 sections, 26 theorems, 102 equations, 1 table, 4 algorithms.

Key Result

Theorem 1.1

Suppose $T \leq \widetilde{O}(m+n)$. There is a quantum online algorithm for zero-sum game $\mathbf{A} \in \mathbb{R}^{m \times n}$ with $\lVert\mathbf{A}\rVert \leq 1$ such that it achieves a total regret of $O(\log(mn))$ with high probability after $T$ rounds, w

Theorems & Definitions (48)

  • Theorem 1.1: Online learning for zero-sum games
  • Corollary 1.2: Computing Nash equilibrium
  • Definition 3.1: Approximate Gibbs sampling oracle
  • Theorem 3.2
  • Corollary 3.3
  • Lemma 4.1: Polynomial approximation, Lemma 7 of vAG2019
  • Theorem 4.2: Fast quantum multi-Gibbs sampling
  • Remark 4.3
  • Remark 4.4
  • Remark 4.5
  • ...and 38 more