Learning in Quantum Common-Interest Games and the Separability Problem

Wayne Lin; Georgios Piliouras; Ryann Sim; Antonios Varvitsiotis

TL;DR

This paper introduces quantum common-interest games (CIGs), in which players’ strategies are density matrices and all players share a common bilinear utility, and links their Nash equilibria (NE) to the KKT points of the Best Separable State (BSS) problem. It develops non-commutative analogues of classical learning dynamics, namely the Linear Quantum Replicator Dynamics (lin-QREP) and the Linear Matrix Multiplicative Weights Update (lin-MMWU), and analyzes their fixed points, Lyapunov structure, and convergence properties via a quantum Shahshahani metric. The authors prove that every NE is a fixed point of these dynamics and that all limit points are fixed points, a set that can be larger than the set of NE. Extensive experiments show that lin-QREP converges to NE in many instances, while lin-MMWU can converge to non-NE fixed points unless perturbed. The authors also show that alternating Best Response (BR) dynamics converge to NE in two-player quantum CIGs, and they present BSS-focused experiments comparing BR with MMWU variants against PPT-SDP ground truths, including large-scale and perturbation-enhanced results. Overall, the work bridges optimization and quantum game theory, offering decentralized learning algorithms for BSS and advancing the understanding of cooperative quantum dynamics, with practical implications for entanglement-aware optimization and quantum information tasks.

Abstract

Learning in games has emerged as a powerful tool for machine learning with numerous applications. Quantum games model interactions between strategic players who have access to quantum resources, and several recent works have studied learning in the competitive regime of quantum zero-sum games. Going beyond this setting, we introduce quantum common-interest games (CIGs) where players have density matrices as strategies and their interests are perfectly aligned. We bridge the gap between optimization and game theory by establishing the equivalence between KKT (first-order stationary) points of an instance of the Best Separable State (BSS) problem and the Nash equilibria of its corresponding quantum CIG. This allows learning dynamics for the quantum CIG to be seen as decentralized algorithms for the BSS problem. Taking the perspective of learning in games, we then introduce non-commutative extensions of the continuous-time replicator dynamics and the discrete-time best response dynamics/linear multiplicative weights update for learning in quantum CIGs. We prove analogues of classical convergence results of the dynamics and explore differences which arise in the quantum setting. Finally, we corroborate our theoretical findings through extensive experiments.

Paper Structure (39 sections, 17 theorems, 121 equations, 19 figures, 4 tables)

Introduction
Background and Related Work on Learning in Games
Classical common-interest and potential games
Learning dynamics in classical games
Learning dynamics in quantum games
Quantum preliminaries
Geometry of the set of density matrices.
Quantum Common-Interest Games and the BSS problem
Quantum games
Quantum games.
Quantum common-interest games.
Nash equilibria and exploitability.
Relation between quantum CIGs and the BSS problem
Best Response Dynamics
Exploitability experiments.
...and 24 more sections

Key Result

Theorem 3.1

The Nash equilibria of a two-player quantum common-interest game with common utility function $u(\rho, \sigma)=\Tr(R(\rho \otimes \sigma))$ correspond to the KKT points of BSS.
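To make the objects in Theorem 3.1 concrete, the following is a minimal NumPy sketch of the utility $u(\rho, \sigma)=\Tr(R(\rho \otimes \sigma))$, a best response, and a simple alternating scheme. It relies on the fact that, for fixed $\sigma$, the utility is linear in $\rho$, so one maximizer is the projector onto a top eigenvector of the induced payoff operator. The function names are hypothetical, the exploitability convention used here (the sum of both players' best unilateral gains) is an assumption that may differ from the paper's definition, and the alternating loop is an illustration rather than the authors' exact BR procedure.

```python
import numpy as np

def utility(R, rho, sigma):
    """Common utility u(rho, sigma) = Tr(R (rho ⊗ sigma))."""
    return np.trace(R @ np.kron(rho, sigma)).real

def payoff_operators(R, rho, sigma):
    """Return Hermitian A, B with u = Tr(rho A) = Tr(sigma B)."""
    n, m = rho.shape[0], sigma.shape[0]
    R4 = R.reshape(n, m, n, m)  # R4[i, k, j, l] = R[(i, k), (j, l)]
    A = np.einsum('ikjl,lk->ij', R4, sigma)  # partial trace over player 2
    B = np.einsum('ikjl,ji->kl', R4, rho)    # partial trace over player 1
    return A, B

def best_response(H):
    """A maximizer of Tr(x H) over density matrices x: top-eigenvector projector."""
    _, vecs = np.linalg.eigh(H)  # eigenvalues are returned in ascending order
    v = vecs[:, -1]
    return np.outer(v, v.conj())

def exploitability(R, rho, sigma):
    """Assumed convention: sum of each player's best unilateral gain."""
    A, B = payoff_operators(R, rho, sigma)
    u = utility(R, rho, sigma)
    return (np.linalg.eigvalsh(A)[-1] - u) + (np.linalg.eigvalsh(B)[-1] - u)

def alternating_best_response(R, rho, sigma, steps=50):
    """Players take turns best-responding; the utility never decreases."""
    utilities = [utility(R, rho, sigma)]
    for _ in range(steps):
        A, _ = payoff_operators(R, rho, sigma)
        rho = best_response(A)
        _, B = payoff_operators(R, rho, sigma)
        sigma = best_response(B)
        utilities.append(utility(R, rho, sigma))
    return rho, sigma, utilities

# Arbitrary test instance: a random Hermitian R on a 3 x 3 bipartite system.
rng = np.random.default_rng(0)
n = m = 3
G = rng.normal(size=(n * m, n * m)) + 1j * rng.normal(size=(n * m, n * m))
R = (G + G.conj().T) / 2
rho0, sigma0 = np.eye(n) / n, np.eye(m) / m  # uniform initialization
rho, sigma, utilities = alternating_best_response(R, rho0, sigma0)
print("final utility:", utilities[-1])
print("final exploitability:", exploitability(R, rho, sigma))
```

Because each step maximizes the shared utility over one player's strategy while the other is held fixed, the recorded utilities form a non-decreasing sequence. Under the assumed convention, an exploitability of zero means that neither player can gain by deviating unilaterally.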

Figures (19)

Figure 1: Exploitability of trajectories under BR. All exploitabilities go to zero quickly.
Figure 2: Exploitability of trajectories under lin-QREP. The exploitability of all trajectories goes to zero.
Figure 3: Comparing exploitability of lin-MMWU with different stepsizes $\eta$. Trajectories with the same color correspond to the same game, with uniform initialization.
Figure 4: Counterexamples showing that convergence to a fixed point (low Frobenius norm) does not imply zero exploitability. All runs of the experiment converge to a fixed point, but several remain bounded away from zero exploitability.
Figure 5: Comparing exploitability of lin-MMWU with different stepsizes $\eta$. Trajectories with the same color correspond to the same game, with uniform initialization. Note that with a larger stepsize the exploitability tends to converge faster, but with a smaller stepsize the trajectories tend to end up with lower exploitability.
...and 14 more figures

Theorems & Definitions (34)

Theorem 3.1
proof
Theorem 3.2
proof
Theorem 4.1
proof
Remark 5.1: Non-commutative extensions
Theorem 5.1
proof
Theorem 5.2
...and 24 more
