Winning Without Observing Payoffs: Exploiting Behavioral Biases to Win Nearly Every Round

Avrim Blum; Melissa Dutz

TL;DR

This paper tackles the problem of winning in symmetric, repeated two-player zero-sum games when payoffs are unobserved and the opponent follows a behaviorally biased strategy. It introduces a prediction-and-best-response framework that uses a halving-based method to forecast the opponent's actions and bias-specific strategies to learn effective responses, winning nearly every round against several biased opponents without knowledge of the payoff matrix. The authors provide concrete algorithms and failure bounds for beating Myopic Best Responder, Gambler's Fallacy, Win-Stay Lose-Shift (including Tie-Shift and Tie-Stay variants), Follow-the-Leader (with unlimited and limited history), and Highest Average Payoff opponents, along with generalizations to unknown strategies drawn from a known set. This work highlights the exploitability of predictable biases in payoff-free settings and suggests directions for evaluating probabilistic biases and more complex game structures.

Abstract

Gameplay under various forms of uncertainty has been widely studied. Feldman et al. (2010) studied a particularly low-information setting in which one observes the opponent's actions but no payoffs, not even one's own, and introduced an algorithm which guarantees one's payoff nonetheless approaches the minimax optimal value (i.e., zero) in a symmetric zero-sum game. Against an opponent playing a minimax-optimal strategy, approaching the value of the game is the best one can hope to guarantee. However, a wealth of research in behavioral economics shows that people often do not make perfectly rational, optimal decisions. Here we consider whether it is possible to actually win in this setting if the opponent is behaviorally biased. We model several deterministic, biased opponents and show that even without knowing the game matrix in advance or observing any payoffs, it is possible to take advantage of each bias in order to win nearly every round (so long as the game has the property that each action beats and is beaten by at least one other action). We also provide a partial characterization of the kinds of biased strategies that can be exploited to win nearly every round, and provide algorithms for beating some kinds of biased strategies even when we don't know which strategy the opponent uses.
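The condition stated in the abstract (each action beats and is beaten by at least one other action) can be checked mechanically when a matrix is available for inspection. The sketch below is illustrative only: it assumes a symmetric zero-sum game is given as a skew-symmetric matrix `payoff`, where `payoff[i][j] > 0` means action `i` beats action `j`. The function name `is_permissible` and the example matrices are assumptions made here, not taken from the paper. In the paper's setting the player never sees this matrix, so the check describes the class of games rather than a step of any algorithm.

```python
def is_permissible(payoff):
    """Return True if every action beats, and is beaten by, at least one other action.

    Assumes payoff is a square skew-symmetric matrix (payoff[i][j] == -payoff[j][i]),
    with payoff[i][j] > 0 meaning action i beats action j.
    """
    n = len(payoff)
    for i in range(n):
        beats_some = any(payoff[i][j] > 0 for j in range(n) if j != i)
        beaten_by_some = any(payoff[i][j] < 0 for j in range(n) if j != i)
        if not (beats_some and beaten_by_some):
            return False
    return True


# Rock-paper-scissors: rows and columns ordered rock, paper, scissors.
ROCK_PAPER_SCISSORS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

# A two-action game in which action 0 always beats action 1.
DOMINATED = [
    [0, 1],
    [-1, 0],
]

if __name__ == "__main__":
    print(is_permissible(ROCK_PAPER_SCISSORS))
    print(is_permissible(DOMINATED))
```

Rock-paper-scissors satisfies the condition, while the two-action game does not, because action 0 is never beaten and action 1 never beats anything.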

Paper Structure (22 sections, 8 theorems, 1 table, 8 algorithms)

Introduction
Setting
Models of behaviorally-biased opponents
Preliminaries and Intuition
Strategies for Beating Behaviorally Biased Opponents
Myopic Best Responder
Gambler's Fallacy Opponent
Win-Stay, Lose-Shift Opponent
Variant: Tie-Shift
Variant: Tie-Stay
Follow-the-Leader Opponent
Variant: Limited History
Highest Average Payoff Opponent
Generalizing
Other Behaviorally-Biased Strategies
...and 7 more sections

Key Result

Theorem 2

Given an opponent playing according to a consistent deterministic strategy which breaks ties among actions according to a fixed ordering over actions, Algorithm alg_prediction_by_halving makes up to $O(n^2)$ prediction mistakes in $O(3^{n^2}n!R)$ time per round, where $R$ is the runtime in which the o
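As a rough illustration of the general idea behind halving-style prediction, the sketch below keeps a set of candidate opponent models, predicts the action most of them agree on, and discards every candidate that disagrees with what the opponent actually played. It is not the paper's algorithm: the candidate models here (`constant`, `copy_my_last`, `cycle_own`) are hypothetical toy strategies chosen for the example, whereas the time bound in the theorem suggests the paper's candidates range over game matrices and tie-breaking orderings. All function names are assumptions.

```python
from collections import Counter

N_ACTIONS = 3  # hypothetical number of actions, labeled 0..2


def constant(action):
    """Hypothetical model: the opponent always plays the same action."""
    return lambda history: action


def copy_my_last(history):
    """Hypothetical model: the opponent repeats our previous action."""
    return history[-1][0] if history else 0


def cycle_own(history):
    """Hypothetical model: the opponent advances its own last action by one."""
    return (history[-1][1] + 1) % N_ACTIONS if history else 0


def predict(candidates, history):
    """Predict the action that the largest group of surviving candidates agrees on."""
    votes = Counter(model(history) for model in candidates)
    return votes.most_common(1)[0][0]


def eliminate(candidates, history, observed):
    """Keep only the candidates that predicted the observed action."""
    return [model for model in candidates if model(history) == observed]


def run(candidates, true_opponent, rounds):
    """Play `rounds` rounds; return (prediction mistakes, surviving candidates)."""
    history = []  # list of (our_action, opponent_action) pairs
    mistakes = 0
    for _ in range(rounds):
        guess = predict(candidates, history)
        actual = true_opponent(history)
        if guess != actual:
            mistakes += 1
        candidates = eliminate(candidates, history, actual)
        # Placeholder: we simply play our guess. Choosing a winning response
        # without observing payoffs is the separate, bias-specific part.
        history.append((guess, actual))
    return mistakes, len(candidates)


if __name__ == "__main__":
    models = [constant(0), constant(1), constant(2), copy_my_last, cycle_own]
    print(run(models, cycle_own, rounds=10))
```

Because the true strategy is among the candidates and is never eliminated, each mistake removes at least one wrong candidate, so the number of mistakes is at most the number of candidates minus one. When only two distinct predictions are in play, a mistake removes at least half of the survivors, which is where the logarithmic mistake bound of the classical halving argument comes from.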

Theorems & Definitions (9)

Definition 1: Permissible Game
Theorem 2
Theorem 3
Theorem 4
Theorem 5
Theorem 6
Theorem 11
Theorem 12
Theorem 13