Optimal design of experiments to identify latent behavioral types
Stefano Balietti, Brennan Klein, Christoph Riedl
TL;DR
This paper tackles the challenge of efficiently designing Bayesian experiments to distinguish latent behavioral types. It introduces two computational innovations, Gaussian Process Upper Confidence Bound-Pure Exploration (GPUCB-PE) for adaptive search and Parameter-Sampled dataset evaluation, that substantially reduce the cost of finding informative experimental designs. Applied to a Stop-Go imperfect-information game, the approach discriminates among competing behavioral models with less data and finds that Roth-Erev reinforcement learning explains human decisions better than Bayes-Nash equilibrium. The authors demonstrate substantial computational gains, show that the designs predicted by experts are often suboptimal, and argue that the framework can be integrated into online experimentation to rapidly test and iterate on behavioral hypotheses.
Abstract
Bayesian optimal experiments that maximize the information gained from collected data are critical for efficiently identifying behavioral models. We extend a seminal method for designing Bayesian optimal experiments by introducing two computational improvements that make the procedure tractable: (1) a search algorithm from artificial intelligence that efficiently explores the space of possible design parameters, and (2) a sampling procedure that evaluates each design parameter combination more efficiently. We apply our procedure to a game of imperfect information to evaluate and quantify the computational improvements. We then collect data across five different experimental designs to compare how well the optimal experimental design discriminates among competing behavioral models relative to the designs chosen by a "wisdom of experts" prediction experiment. We find that the experiment suggested by the optimal design approach requires significantly less data to distinguish behavioral models (i.e., test hypotheses) than the experiment suggested by experts. Substantively, we find that reinforcement learning best explains human decision-making in the imperfect information game and that behavior is not adequately described by the Bayesian Nash equilibrium. Our procedure is general and computationally efficient and can be applied to dynamically optimize online experiments.
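To make the idea of choosing a design by maximizing expected information concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single design parameter `d` in [0, 1], a binary outcome, and two hypothetical behavioral models (`model_a`, `model_b`) with made-up response probabilities. It scores each design by the mutual information between the model indicator and the outcome, which is one common information measure and is an assumption here. The exhaustive grid search is only a stand-in for the paper's adaptive search algorithm.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) outcome; clipping avoids log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def expected_information_gain(design, models, prior):
    """Mutual information between the model indicator and a binary outcome."""
    # Probability of the outcome under each candidate model at this design.
    probs = np.array([m(design) for m in models])
    # Outcome probability after averaging over the prior on models.
    marginal = np.dot(prior, probs)
    # Entropy of the mixture minus the prior-weighted entropy of each model.
    return binary_entropy(marginal) - np.dot(prior, binary_entropy(probs))

# Hypothetical models (illustrative assumptions, not taken from the paper).
def model_a(d):
    return 0.5 + 0.4 * d  # response probability grows with the design parameter

def model_b(d):
    return 0.5            # response probability ignores the design parameter

designs = np.linspace(0.0, 1.0, 11)  # candidate design parameter values
prior = np.array([0.5, 0.5])         # equal prior belief in each model
gains = np.array(
    [expected_information_gain(d, [model_a, model_b], prior) for d in designs]
)
best_design = designs[np.argmax(gains)]
print(f"most informative design: d = {best_design:.1f}")
```

In this toy setting the two models make identical predictions at `d = 0`, so an experiment run there cannot separate them, while designs where the predictions diverge are more informative. A real design space has many parameters and costly evaluations, which is why an exhaustive grid like this one quickly becomes impractical.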
