Characterizing Robustness of Strategies to Novelty in Zero-Sum Open Worlds
Mayank Kejriwal, Shilpa Thomas, Hongyu Li
TL;DR
This work studies how fixed strategies in open-world two-player zero-sum games degrade when rules or payoffs change in novel ways. It introduces a domain-agnostic matrix framework with two metrics, per-agent robustness and global impact, and applies them to large CPS-like corpora in Iterated Prisoner's Dilemma and heads-up Texas Hold'em Poker, revealing systematic patterns of fragility and resilience. The study reports substantial heterogeneity across agents and identifies specific novelties that cause strong systemic disruption, underscoring the need for open-world learning and for robustness evaluation beyond closed benchmarks. The findings provide quantitative baselines and practical guidance for designing resilient autonomous systems in adversarial and dynamic environments. Future work will focus on predictive models and adaptive policies across more domains, including LLM-driven agents.
Abstract
In open-world environments, artificial agents must often contend with novel conditions that deviate from their training or design assumptions. This paper studies the robustness of fixed-strategy agents to such novelty within the setting of two-player zero-sum games. We present a general framework for characterizing the impact of environmental novelties, such as changes in payoff structure or action constraints, on agent performance in two distinct domains: Iterated Prisoner's Dilemma (IPD) and heads-up Texas Hold'em Poker. Novelty is operationalized as a perturbation of the game's rules or scoring mechanics, while agent behavior remains fixed. To measure the effects, we introduce two metrics: per-agent robustness, quantifying the relative performance shift of each strategy across novelties, and global impact, summarizing the population-wide disruption caused by a novelty. Our experiments, comprising 30 IPD agents across 20 payoff matrix novelties and 10 Poker agents across 5 rule-based novelties, reveal systematic patterns in robustness and highlight certain novelties that induce severe destabilization. The results offer insights into agent generalizability under perturbation and provide a quantitative basis for designing safer and more resilient autonomous systems in adversarial and dynamic environments.
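The abstract names the two metrics but does not state their formulas, so the following is only a minimal sketch under assumed definitions, not the authors' method. It assumes an agent-by-novelty matrix of scores, takes the relative shift as (perturbed - baseline) / |baseline|, and averages absolute shifts along each axis. The function names, the averaging choice, and the toy numbers are all hypothetical.

```python
import numpy as np


def relative_shift(baseline, perturbed):
    """Relative performance change of each agent under each novelty.

    baseline:  shape (n_agents,), score of each agent in the unmodified game.
    perturbed: shape (n_agents, n_novelties), score under each novelty.
    Assumes every baseline score is nonzero (hypothetical simplification).
    """
    baseline = np.asarray(baseline, dtype=float)[:, None]
    perturbed = np.asarray(perturbed, dtype=float)
    return (perturbed - baseline) / np.abs(baseline)


def per_agent_robustness(shift):
    """Assumed definition: mean absolute shift per agent (lower = more robust)."""
    return np.abs(shift).mean(axis=1)


def global_impact(shift):
    """Assumed definition: mean absolute shift per novelty across all agents."""
    return np.abs(shift).mean(axis=0)


# Toy data: 3 agents, 2 novelties (illustrative values, not from the paper).
baseline = [2.0, 4.0, 1.0]
perturbed = [[2.0, 1.0],
             [4.4, 2.0],
             [1.1, 0.5]]

shift = relative_shift(baseline, perturbed)
robustness = per_agent_robustness(shift)
impact = global_impact(shift)

print("per-agent robustness:", np.round(robustness, 3))
print("global impact:", np.round(impact, 3))
```

In this toy matrix the second novelty halves every agent's score, so it dominates the global impact, while the first agent is the least affected overall. Averaging along rows versus columns is what separates the per-agent view from the population-wide view.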
