Finding equilibria: simpler for pessimists, simplest for optimists
Léonard Brice, Thomas Henzinger, K. S. Thejaswini
TL;DR
The paper addresses equilibria in multiplayer simple stochastic games under risk-sensitive preferences by examining entropic risk (ER) and introducing extreme risk (XR). It proves that risk-sensitive equilibria (RSEs) under ER exist for non-negative terminal rewards but shows that the constrained existence problem is undecidable, motivating XR as a tractable qualitative alternative that coincides with ER in the limits of infinitely positive and infinitely negative risk parameters. The authors establish that RSEs under XR exist for non-negative rewards, show that the constrained existence problem under XR is NP-complete in general and PTIME-complete when all players are extreme optimists, and provide a polynomial-time construction method with finite-memory guarantees. This yields the first decidable fragment for equilibria in simple stochastic games with terminal objectives, without restricting strategy types or the number of players, broadening applicability to safety-critical and adversarial multi-agent settings where risk perception is domain-dependent.
Abstract
We consider simple stochastic games with terminal-node rewards and multiple players, who have differing perceptions of risk. Specifically, we study risk-sensitive equilibria (RSEs), where no player can improve their perceived reward -- based on their risk parameter -- by deviating from their strategy. We start with the entropic risk (ER) measure, which is widely studied in finance. ER characterises the players on a quantitative spectrum, with positive risk parameters representing optimists and negative parameters representing pessimists. Building on known results for Nash equilibria, we show that RSEs exist under ER for all games with non-negative terminal rewards. However, using similar techniques, we also show that the corresponding constrained existence problem -- to determine whether an RSE exists under ER with the payoffs in given intervals -- is undecidable. To address this, we introduce a new, qualitative risk measure -- called extreme risk (XR) -- which coincides with the limit cases of positively infinite and negatively infinite ER parameters. Under XR, every player is an extremist: an extreme optimist perceives their reward as the maximum payoff that can be achieved with positive probability, while an extreme pessimist expects the minimum payoff achievable with positive probability. Our first main result proves the existence of RSEs also under XR for non-negative terminal rewards. Our second main result shows that under XR the constrained existence problem is not only decidable, but NP-complete. Moreover, when all players are extreme optimists, the problem becomes PTIME-complete. Our algorithmic results apply to all rewards, positive or negative, establishing the first decidable fragment for equilibria in simple stochastic games with terminal objectives without restrictions on strategy types or number of players.
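To make the two risk measures concrete, here is a minimal Python sketch for a single player facing a finite lottery over terminal rewards. It assumes the textbook form of entropic risk, ER_theta(X) = (1/theta) * log E[exp(theta * X)], which may differ from the paper's exact definition (for instance in the base of the exponential). The function names, the `(probability, reward)` representation and the example lottery are illustrative assumptions, not taken from the paper.

```python
import math


def entropic_risk(outcomes, theta):
    """Perceived reward of a lottery under entropic risk (assumed textbook form).

    outcomes: list of (probability, reward) pairs whose probabilities sum to 1.
    theta:    risk parameter; positive for optimists, negative for pessimists.
    """
    if theta == 0:
        # The limit theta -> 0 is the plain expectation (risk-neutral player).
        return sum(p * r for p, r in outcomes)
    # Evaluate log E[exp(theta * X)] with the log-sum-exp trick to avoid overflow.
    terms = [math.log(p) + theta * r for p, r in outcomes if p > 0]
    m = max(terms)
    log_mean_exp = m + math.log(sum(math.exp(t - m) for t in terms))
    return log_mean_exp / theta


def extreme_risk(outcomes, optimist):
    """Perceived reward under extreme risk, as described in the abstract.

    An extreme optimist sees the maximum reward reachable with positive
    probability; an extreme pessimist sees the minimum such reward.
    """
    support = [r for p, r in outcomes if p > 0]
    return max(support) if optimist else min(support)


# Hypothetical lottery: reward 0 with probability 0.9, reward 10 with probability 0.1.
lottery = [(0.9, 0.0), (0.1, 10.0)]
for theta in (-50, -1, 0, 1, 50):
    print(theta, entropic_risk(lottery, theta))
print("extreme optimist:", extreme_risk(lottery, optimist=True))
print("extreme pessimist:", extreme_risk(lottery, optimist=False))
```

In this sketch the entropic risk value increases with theta and approaches the extreme risk values as theta grows large in either direction, which illustrates the sense in which XR coincides with the limit cases of ER. Note that XR depends only on which outcomes have positive probability, not on how likely they are.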
