Tree-Based Stochastic Optimization for Solving Large-Scale Urban Network Security Games
Shuxin Zhuang, Linjian Meng, Shuxin Li, Minming Li, Youzhi Zhang
TL;DR
The paper tackles the intractability of computing Nash equilibria (NE) in large-scale urban network security games (UNSGs), which stems from their combinatorial action spaces. It introduces Tree-based Stochastic Optimization (TSO), which uses a tree-based action representation to sample from action spaces that cannot be enumerated, and a sample-and-prune mechanism that reduces the risk of converging to suboptimal local optima. The authors derive a tree-based Nash Advantage Loss (NAL), shown to be theoretically equivalent to the unbiased NAL framework, and prove gradient equivalence at stationary points; this enables unbiased gradient-based optimization for NE in UNSGs and scales to graphs with millions of potential strategies. Empirical results show TSO outperforming the Policy-Space Response Oracle (PSRO) framework and competing baselines across small, medium, and large-scale UNSGs, including asymmetric-payoff and decentralized-defender settings, while training faster. This suggests a practical, scalable approach to NE finding in complex security games, with potential applicability to other large-scale normal-form games. The entropy-regularized utility $u_i^{\tau}(\boldsymbol{x}) = u_i(\boldsymbol{x}) - \tau \boldsymbol{x}_i^{\top} \log \boldsymbol{x}_i$ and related formulations underpin the low-variance optimization core of the framework.
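As a small numerical illustration of the entropy-regularized utility $u_i^{\tau}(\boldsymbol{x}) = u_i(\boldsymbol{x}) - \tau \boldsymbol{x}_i^{\top} \log \boldsymbol{x}_i$, the sketch below evaluates it for one player. The function name, the toy utility value of 1.0, and the choice $\tau = 0.1$ are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def entropy_regularized_utility(utility, strategy, tau):
    """Return u_i(x) - tau * x_i^T log x_i for a single player.

    utility  : the player's expected utility u_i(x), assumed already computed
    strategy : the player's mixed strategy x_i (probabilities summing to 1)
    tau      : regularization temperature, tau >= 0
    """
    # x_i^T log x_i is the negative Shannon entropy; 0 * log 0 is treated as 0.
    neg_entropy = sum(p * math.log(p) for p in strategy if p > 0.0)
    return utility - tau * neg_entropy

# Hypothetical toy values: same base utility, two different strategies.
uniform = entropy_regularized_utility(1.0, [0.5, 0.5], tau=0.1)
pure = entropy_regularized_utility(1.0, [1.0, 0.0], tau=0.1)
print(uniform, pure)
```

With equal base utility, the uniform strategy receives a bonus of $\tau \log 2$ while the pure strategy receives none, which is how the regularizer favors mixed strategies.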
Abstract
Urban Network Security Games (UNSGs), which model the strategic allocation of limited security resources on city road networks, are critical for urban safety. However, finding a Nash Equilibrium (NE) in large-scale UNSGs is challenging because their action spaces are massive and combinatorial. One common approach is the Policy-Space Response Oracle (PSRO) framework, which requires computing best responses (BRs) at each iteration. Computing exact BRs is impractical in large-scale games, though, and approximating them with reinforcement learning inevitably introduces errors, which limits the overall effectiveness of PSRO methods. Recent advances in using non-convex stochastic optimization to approximate an NE offer a promising alternative to the burdensome BR computation. Yet applying existing stochastic optimization techniques with an unbiased loss function to UNSGs remains challenging because the action spaces are too vast to be represented effectively by neural networks. To address these issues, we introduce Tree-based Stochastic Optimization (TSO), a framework that bridges the gap between the stochastic optimization paradigm for NE-finding and the demands of UNSGs. Specifically, we employ a tree-based action representation that maps the whole action space onto a tree structure, which lets neural networks represent actions even when the action space cannot be enumerated. We then incorporate this representation into the loss function and theoretically demonstrate its equivalence to the unbiased loss function. To further improve the quality of the converged solution, we introduce a sample-and-prune mechanism that reduces the risk of being trapped in suboptimal local optima. Extensive experiments indicate that TSO outperforms the baseline algorithms on UNSGs.
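The idea of mapping an action space onto a tree can be illustrated with a minimal sketch in which an action is a path on a road network and each tree node is a partial path. Everything here is an assumption for illustration: the toy graph, the uniform placeholder policy (standing in for a learned network), and the function names are hypothetical, and the paper's actual construction may differ. The point is only that an action can be sampled step by step, and its probability computed as a product of conditional probabilities, without enumerating all actions.

```python
import random

# Hypothetical toy road network as an adjacency list (not from the paper).
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["e"],
    "e": [],
}

def child_distribution(prefix):
    """Placeholder policy: uniform over the nodes reachable from the prefix's end.

    A learned model would instead output this distribution from the prefix.
    """
    children = GRAPH[prefix[-1]]
    return {child: 1.0 / len(children) for child in children}

def sample_action(start, max_steps, rng):
    """Walk down the action tree from the root, one edge per step.

    Returns the sampled path and its probability, i.e. the product of the
    conditional probabilities along the root-to-leaf branch.
    """
    path, prob = [start], 1.0
    for _ in range(max_steps):
        dist = child_distribution(path)
        if not dist:  # dead end: this branch of the tree stops here
            break
        nodes = list(dist)
        nxt = rng.choices(nodes, weights=[dist[n] for n in nodes])[0]
        prob *= dist[nxt]
        path.append(nxt)
    return path, prob

def enumerate_actions(prefix, steps_left, prob=1.0):
    """List every leaf with its probability (only feasible on toy graphs)."""
    dist = child_distribution(prefix)
    if steps_left == 0 or not dist:
        return [(prefix, prob)]
    leaves = []
    for node, p in dist.items():
        leaves.extend(enumerate_actions(prefix + [node], steps_left - 1, prob * p))
    return leaves

path, prob = sample_action("a", 3, random.Random(0))
leaves = enumerate_actions(["a"], 3)
print(path, prob, sum(p for _, p in leaves))
```

On this toy graph the leaf probabilities sum to one, so the per-step conditionals define a valid distribution over complete actions. Sampling costs one policy call per step rather than one score per action, which is what keeps the approach usable when enumeration is not.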
