Winning Without Observing Payoffs: Exploiting Behavioral Biases to Win Nearly Every Round
Avrim Blum, Melissa Dutz
TL;DR
This paper studies how to win symmetric, repeated two-player zero-sum games when payoffs are unobserved and the opponent follows a behaviorally biased strategy. It introduces a prediction-and-best-response framework: a halving-based method forecasts the opponent's actions, and bias-specific strategies learn effective responses, so the player wins nearly every round against several biases without knowing the payoff matrix. The authors provide concrete algorithms and failure bounds for beating Myopic Best Responder, Gambler's Fallacy, Win-Stay Lose-Shift (including Tie-Shift and Tie-Stay variants), Follow-the-Leader (with unlimited and limited history), and Highest Average Payoff opponents, along with generalizations to unknown strategies drawn from a known set. The work highlights how exploitable predictable biases are in payoff-free settings and suggests future directions, including probabilistic biases and more complex game structures.
Abstract
Gameplay under various forms of uncertainty has been widely studied. Feldman et al. (2010) studied a particularly low-information setting in which one observes the opponent's actions but no payoffs, not even one's own, and introduced an algorithm which guarantees one's payoff nonetheless approaches the minimax optimal value (i.e., zero) in a symmetric zero-sum game. Against an opponent playing a minimax-optimal strategy, approaching the value of the game is the best one can hope to guarantee. However, a wealth of research in behavioral economics shows that people often do not make perfectly rational, optimal decisions. Here we consider whether it is possible to actually win in this setting if the opponent is behaviorally biased. We model several deterministic, biased opponents and show that even without knowing the game matrix in advance or observing any payoffs, it is possible to take advantage of each bias in order to win nearly every round (so long as the game has the property that each action beats and is beaten by at least one other action). We also provide a partial characterization of the kinds of biased strategies that can be exploited to win nearly every round, and provide algorithms for beating some kinds of biased strategies even when we don't know which strategy the opponent uses.
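The idea of predicting a deterministic opponent whose strategy is unknown but drawn from a known set can be illustrated with the classic halving algorithm. The sketch below is an illustration under stated assumptions, not the authors' construction: the names `HalvingPredictor`, `constant`, `copy_our_last`, `cycle`, and `simulate` are hypothetical, and the candidate strategies are toy examples rather than the biases modeled in the paper. It assumes every candidate is a deterministic function of the observed action history and that the true opponent is one of the candidates. Under those assumptions, a plurality vote over the surviving candidates makes at most log2(N) wrong predictions for N candidates, because each mistake eliminates at least half of them.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A history is a list of (our_action, opponent_action) pairs; no payoffs appear.
History = List[Tuple[int, int]]
# A candidate maps the history so far to the opponent's next action.
Candidate = Callable[[History], int]


class HalvingPredictor:
    """Plurality vote over the candidates still consistent with the history."""

    def __init__(self, candidates: Sequence[Candidate]) -> None:
        self.alive: List[Candidate] = list(candidates)
        self.mistakes = 0

    def predict(self, history: History) -> int:
        # Assumes at least one candidate is still alive, which holds
        # whenever the true opponent is in the candidate set.
        votes = Counter(c(history) for c in self.alive)
        return votes.most_common(1)[0][0]

    def update(self, history: History, predicted: int, actual: int) -> None:
        # Call with the history as it was BEFORE this round was appended.
        if predicted != actual:
            self.mistakes += 1
        # Keep only candidates that would have played the observed action.
        self.alive = [c for c in self.alive if c(history) == actual]


# Toy candidate strategies (hypothetical, for illustration only).
def constant(k: int) -> Candidate:
    return lambda history: k


def copy_our_last(default: int = 0) -> Candidate:
    return lambda history: history[-1][0] if history else default


def cycle(n: int) -> Candidate:
    return lambda history: len(history) % n


def simulate(rounds: int = 10) -> HalvingPredictor:
    true_opponent = cycle(3)  # unknown to the predictor, but in the set
    candidates = [constant(0), constant(1), constant(2),
                  copy_our_last(), true_opponent]
    predictor = HalvingPredictor(candidates)
    history: History = []
    for _ in range(rounds):
        predicted = predictor.predict(history)
        actual = true_opponent(history)
        predictor.update(history, predicted, actual)
        # Placeholder for our own move: choosing a winning response
        # without observing payoffs is outside the scope of this sketch.
        our_action = predicted
        history.append((our_action, actual))
    return predictor


if __name__ == "__main__":
    p = simulate()
    print("mistakes:", p.mistakes, "surviving candidates:", len(p.alive))
```

The sketch covers only the prediction half of the problem. Turning a correct prediction into a win still requires learning which action beats the predicted one without seeing any payoffs, and the sketch does not attempt that step.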
