General Performance Evaluation for Competitive Resource Allocation Games via Unseen Payoff Estimation
N'yoma Diamond, Fabricio Murai
TL;DR
The paper introduces a generalized payoff $L_p$ for competitive resource allocation games to unify performance evaluation across feedback regimes, including bandit and semi-bandit settings. It defines two core metrics, Max Payoff and Expected Payoff, and proposes three uncertainty-aware estimators (Observable Max Payoff, Supremum Payoff, and Observable Expected Payoff) computed via feasible opponent decision sets and under a Uniform Decision Assumption. Using Colonel Blotto as a case study, it develops a graph-based model (the decision graph) together with pruning and bounding techniques that efficiently identify feasible opponent decisions and tighten the estimates. Proofs and empirical validation show near-ground-truth accuracy across diverse configurations. The findings enable problem-agnostic evaluation of resource allocation algorithms in mutually adaptive adversarial settings, with practical implications for cybersecurity, economics, and related domains.
Abstract
Many high-stakes decision-making problems, such as those found in cybersecurity and economics, can be modeled as competitive resource allocation games. In these games, multiple players must allocate limited resources to overcome their opponent(s) while minimizing any individual losses they incur. However, existing means of assessing the performance of resource allocation algorithms are highly disparate and problem-dependent. As a result, evaluating such algorithms is unreliable or impossible in many contexts and applications, especially when considering differing levels of feedback. To resolve this problem, we propose a generalized definition of payoff which uses an arbitrary user-provided function. This unifies performance evaluation across all contexts and levels of feedback. Using this definition, we develop metrics for evaluating player performance, and estimators to approximate them under uncertainty (i.e., bandit or semi-bandit feedback). These metrics and their respective estimators provide a problem-agnostic means to contextualize and evaluate algorithm performance. To validate the accuracy of our estimators, we use the Colonel Blotto ($\mathcal{CB}$) game as an example. For this game, we propose a graph-pruning approach to efficiently identify feasible opponent decisions, which are used in computing our estimation metrics. Using various resource allocation algorithms and game parameters, a suite of $\mathcal{CB}$ games is simulated and used to compute and evaluate the quality of our estimates. These simulations empirically show our approach to be highly accurate at estimating the metrics associated with the unseen outcomes of an opponent's latent behavior.
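
To make the notion of feasible opponent decisions concrete, the following is a minimal Python sketch for a tiny $\mathcal{CB}$ game under bandit feedback. It is an illustration under stated assumptions, not the paper's method: it enumerates opponent allocations by brute force instead of using the graph-pruning approach, it takes the payoff to be the number of battlefields won with ties counted as losses, and the function names (`allocations`, `payoff`, `feasible_opponent_decisions`, `expected_payoff`) are hypothetical. The averaged payoff is only in the spirit of an expected-payoff estimate under a uniform assumption over feasible opponent decisions and should not be read as the paper's exact definition.

```python
def allocations(total, fields):
    """Yield every split of `total` integer resources over `fields` battlefields."""
    if fields == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in allocations(total - first, fields - 1):
            yield (first,) + rest


def payoff(mine, theirs):
    """Battlefields won by `mine`. Assumed tie rule: a tie counts as a loss."""
    return sum(a > b for a, b in zip(mine, theirs))


def feasible_opponent_decisions(mine, observed_payoff, opp_total):
    """Opponent allocations consistent with the scalar (bandit) feedback."""
    return [
        d for d in allocations(opp_total, len(mine))
        if payoff(mine, d) == observed_payoff
    ]


def expected_payoff(decision, feasible):
    """Average payoff of `decision`, weighting feasible opponent decisions uniformly."""
    return sum(payoff(decision, d) for d in feasible) / len(feasible)


# Toy round: 3 battlefields, both players hold 3 resources (assumed values).
mine = (2, 1, 0)
observed = 1  # the only feedback the player receives under bandit feedback
feasible = feasible_opponent_decisions(mine, observed, opp_total=3)

# Best alternative decision in hindsight, judged against the feasible set.
best = max(allocations(3, 3), key=lambda d: expected_payoff(d, feasible))

print(len(feasible), "feasible opponent decisions")
print("played:", mine, expected_payoff(mine, feasible))
print("best alternative:", best, expected_payoff(best, feasible))
```

Brute-force enumeration grows combinatorially with the number of battlefields and resources, which is the cost that a pruning-based search over a decision graph is meant to avoid.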
