Two-Player Zero-Sum Differential Games with One-Sided Information

Mukesh Ghimire, Zhe Xu, Yi Ren

TL;DR

This work addresses two-player zero-sum differential games with continuous action spaces under one-sided information, leveraging convexification and Isaacs' condition to achieve a computational cost that is independent of the action-space size. It introduces the Continuous-Action Mixed-Strategy (CAMS) solver, which decouples the primal and dual equilibrium computations and uses DS-GDA-based backups to approximate the value function across discretized time while learning a surrogate model. The approach yields tractable, scalable equilibrium approximations for incomplete-information differential games, demonstrated on Hexner's homing game with performance advantages over state-of-the-art baselines and without reliance on coarse discretization. The work provides both a theoretical framing and an algorithmic pathway for applying differential-game solvers to real-world continuous-action scenarios, with open-source code to facilitate adoption and further research.

Abstract

Unlike poker, where the action space $\mathcal{A}$ is discrete, differential games in the physical world often have continuous action spaces that are not amenable to discrete abstraction, which makes no-regret algorithms with $\mathcal{O}(|\mathcal{A}|)$ complexity unscalable. To address this challenge within the scope of two-player zero-sum (2p0s) games with one-sided information, we show that (1) a computational complexity independent of $|\mathcal{A}|$ can be achieved by exploiting the convexification property of incomplete-information games and Isaacs' condition, which commonly holds for dynamical systems, and that (2) the computation of the two equilibrium strategies can be decoupled under one-sidedness of information. Leveraging these insights, we develop an algorithm that successfully approximates the optimal strategy in a homing game. Code is available at https://github.com/ghimiremukesh/cams/tree/workshop

Paper Structure

This paper contains 16 sections, 10 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: State-of-the-art algorithms such as CFR require expanding over the entire action space (left), whereas our algorithm only requires expanding over at most $I$ actions for P1 ($I+1$ for P2) at each decision node (right).
  • Figure 2: Hexner's game with a sample equilibrium trajectory. P1 starts to move to its target after $t_r$.
  • Figure 3: Comparisons between CAMS and baseline algorithms.
  • Figure 4: Trajectories using strategies from CAMS and DeepCFR. Markers indicate initial position.
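
The convexification property named in the abstract, and the bound of at most $I$ actions for P1 in the caption of Figure 1, can be illustrated with a small numerical sketch. The code below is not taken from the CAMS repository. It assumes $I = 2$ types, so the public belief is a scalar $p \in [0, 1]$, and it assumes the informed player minimizes, so that the value is the lower convex envelope over $p$. The function `f` is a made-up non-convex stand-in for a "non-revealing" value, and `lower_convex_envelope` and `split_belief` are hypothetical helper names. The point is only that any belief is supported by at most two points of the envelope, which is one way to see why the number of branches need not grow with $|\mathcal{A}|$.

```python
import numpy as np

def lower_convex_envelope(p, f):
    """Lower convex envelope of samples (p[i], f[i]); p must be strictly increasing.

    Returns the envelope evaluated on p and the indices of the supporting points.
    """
    hull = []  # indices of points on the lower convex hull, left to right
    for i in range(len(p)):
        # Drop the last hull point while it lies on or above the chord hull[-2] -> i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (p[b] - p[a]) * (f[i] - f[a]) - (f[b] - f[a]) * (p[i] - p[a])
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(p, p[hull], f[hull]), hull

def split_belief(p, hull, p0):
    """Find the two supporting beliefs that bracket p0 and the weight on the lower one."""
    hp = p[hull]
    j = int(np.searchsorted(hp, p0))      # first hull belief >= p0
    j = min(max(j, 1), len(hp) - 1)       # keep a valid bracket at the boundaries
    lo, hi = hp[j - 1], hp[j]
    lam = (hi - p0) / (hi - lo)           # p0 = lam * lo + (1 - lam) * hi
    return lo, hi, lam

# Toy non-convex function of the belief (an assumption, not the paper's value).
p = np.linspace(0.0, 1.0, 101)
f = np.minimum((p - 0.2) ** 2, (p - 0.8) ** 2)

env, hull = lower_convex_envelope(p, f)
lo, hi, lam = split_belief(p, hull, 0.5)
print(f"belief 0.5 splits into {lo:.2f} and {hi:.2f} with weight {lam:.2f} on the first")
```

In this toy case the belief $p = 0.5$ is split into two posteriors near $0.2$ and $0.8$ with equal weight, and the envelope value there is lower than `f(0.5)`. For general $I$ the belief lives in a simplex of dimension $I - 1$, and the same splitting needs at most $I$ supporting points.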