GUARD: Constructing Realistic Two-Player Matrix and Security Games for Benchmarking Game-Theoretic Algorithms

Noah Krever, Jakub Černý, Moïse Blanchard, Christian Kroer

TL;DR

GUARD tackles the challenge of benchmarking game-theoretic algorithms on realistic, security-inspired settings by generating data-driven two-player games from public sources (e.g., Movebank, OpenStreetMap, census data). It defines a three-tier framework (Graph Game, Security Game, Domain-Specific Games) with normal-form (NFG) and schedule-form variants, and offers a library of preconfigured instances (GSGs and ISGs) along with export to OpenSpiel and Gambit formats. The authors establish theoretical limitations of random benchmarks and show empirically that realistic instances yield richer equilibria, more diverse supports, and more stable convergence across standard solvers, underscoring the need for realistic benchmarks in algorithmic game theory. The framework enables reproducible, domain-aligned benchmarking and provides ready-to-use data-driven instances that can inform security planning and policy-relevant research. The authors note scalability and data-fidelity constraints and suggest future extensions to richer constraints and additional domains.

Abstract

Game-theoretic algorithms are commonly benchmarked on recreational games, classical constructs from economic theory such as congestion and dispersion games, or entirely random game instances. While the past two decades have seen the rise of security games -- grounded in real-world scenarios like patrolling and infrastructure protection -- their practical evaluation has been hindered by limited access to the datasets used to generate them. In particular, although the structural components of these games (e.g., patrol paths derived from maps) can be replicated, the critical data defining target values -- central to utility modeling -- remain inaccessible. In this paper, we introduce a flexible framework that leverages open-access datasets to generate realistic matrix and security game instances. These include animal movement data for modeling anti-poaching scenarios and demographic and infrastructure data for infrastructure protection. Our framework allows users to customize utility functions and game parameters, while also offering a suite of preconfigured instances. We provide theoretical results highlighting the degeneracy and limitations of benchmarking on random games, and empirically compare our generated games against random baselines across a variety of standard algorithms for computing Nash and Stackelberg equilibria, including linear programming, incremental strategy generation, and self-play with no-regret learners.

Paper Structure

This paper contains 74 sections, 4 theorems, 89 equations, 4 figures, 8 tables, 4 algorithms.

Key Result

Theorem 1

Let $A,B$ be sampled i.i.d. uniformly in $[0,1]$. Then (i) $V(x^*(1))\sim \text{Beta}(n,1)$, and for every $C\geq 0$, $\mathbb E_{A,B}[V(x^*(1))] = 1-\frac{1}{n+1}$ and $\mathbb P_{A,B}\left[ V(x^*(1)) < 1-\frac{C}{n} \right] \leq e^{-C},$ and (ii) there exist universal constants $c_0,c_1,c_2>0$ suc
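
Part (i) can be sanity-checked numerically. The sketch below rests on an assumption not stated in the excerpt: that $V(x^*(1))$ is distributed like the maximum of $n$ i.i.d. Uniform$[0,1]$ payoffs, which is one reading consistent with the stated $\text{Beta}(n,1)$ law. Under that reading the CDF is $t^n$, so the mean is $\frac{n}{n+1} = 1-\frac{1}{n+1}$ and $\mathbb P[V < 1-\frac{C}{n}] = (1-\frac{C}{n})^n \leq e^{-C}$ for $0 \leq C \leq n$. The function names and parameter values are illustrative and do not come from the paper.

```python
import math
import random


def max_of_uniforms(n, rng):
    """Largest of n i.i.d. Uniform[0, 1] draws; this maximum follows Beta(n, 1)."""
    return max(rng.random() for _ in range(n))


def tail_probability(n, c):
    """Exact P[max < 1 - c/n] = (1 - c/n)^n for 0 <= c <= n (Beta(n, 1) has CDF t^n)."""
    return (1.0 - c / n) ** n


def simulate(n=50, trials=20000, c=2.0, seed=0):
    """Monte Carlo estimates of the mean and of the lower-tail probability."""
    rng = random.Random(seed)
    samples = [max_of_uniforms(n, rng) for _ in range(trials)]
    empirical_mean = sum(samples) / trials
    empirical_tail = sum(s < 1.0 - c / n for s in samples) / trials
    return empirical_mean, empirical_tail


if __name__ == "__main__":
    n, c = 50, 2.0
    mean, tail = simulate(n=n, c=c)
    print(f"empirical mean {mean:.4f} vs 1 - 1/(n+1) = {1 - 1 / (n + 1):.4f}")
    print(f"empirical tail {tail:.4f} vs exact {tail_probability(n, c):.4f}"
          f" <= e^-C = {math.exp(-c):.4f}")
```

The empirical mean should sit close to $1-\frac{1}{n+1}$, and the empirical tail should stay below $e^{-C}$ up to sampling noise. This illustrates how tightly the quantity concentrates near 1 in random games.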

Figures (4)

  • Figure 1: The structure of the GUARD framework.
  • Figure 2: (Left) Down-sampled elephant movements in Lobéké National Park and the corresponding GSG model. (Right) Civil infrastructure in Manhattan's Chinatown and the corresponding ISG model. Graph structures represent the traversable game environments. Red nodes indicate target locations, blue house icons are home bases.
  • Figure 3: Comparison of GSG utility and runtime sparsity results for NFG and SFG settings with varying timesteps. Each subplot shows normalized utility and runtime as a function of normalized support, with real (solid) and random (dashed) lines for each timestep setting. Error bars on random runs show the standard error ($\frac{\sigma}{\sqrt{n}}$) over 10 random seeds.
  • Figure 4: Convergence of iterative algorithms for different game types, formulations, and real vs. randomized (dashed lines with standard error bars) runs.

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • proof
  • Lemma 2
  • proof