Comparing Reinforcement Learning and Human Learning using the Game of Hidden Rules
Eric Pulick, Vladimir Menkov, Yonatan Mintz, Paul Kantor, Vicki Bier
TL;DR
The paper addresses how task structure influences learning by introducing the Game of Hidden Rules (GOHR), a controllable environment that encodes hidden rules as rule lines and atoms on a $6\times 6$ board with $4$ buckets and $144$ actions. It compares human learners to two reinforcement learning (RL) agents (DQN and REINFORCE) across stationary versus non-stationary rules and varying rule generality, using memory-aware featurizations. Key contributions include a flexible rule syntax that enables precise manipulation of task structure, systematic comparisons of human learning (HL) and RL across structured tasks, and empirical findings that humans and RL agents respond differently to task structure: RL agents adapt more easily to rule generality, while humans show selective difficulties. This work advances a task-oriented understanding of RL and HL and provides a shareable platform for studying curricula, transfer learning, and human–machine teaming.
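The action count quoted above is consistent with one action per (board cell, bucket) pair, since $6 \times 6 \times 4 = 144$. The sketch below enumerates such an action space under that assumption. The names `ACTIONS`, `action_index`, and `action_from_index` are hypothetical and are not taken from the GOHR codebase, and the flattening order is an illustrative choice.

```python
from itertools import product

BOARD_SIZE = 6   # 6 x 6 board, as stated in the summary
NUM_BUCKETS = 4  # four buckets, as stated in the summary

# Assumption: an action is a (row, col, bucket) triple, i.e. "take the
# piece at this cell and place it in this bucket".
ACTIONS = list(product(range(BOARD_SIZE), range(BOARD_SIZE), range(NUM_BUCKETS)))


def action_index(row, col, bucket):
    """Flatten (row, col, bucket) into one index in [0, 144)."""
    return (row * BOARD_SIZE + col) * NUM_BUCKETS + bucket


def action_from_index(index):
    """Invert action_index: recover (row, col, bucket) from a flat index."""
    cell, bucket = divmod(index, NUM_BUCKETS)
    row, col = divmod(cell, BOARD_SIZE)
    return row, col, bucket


if __name__ == "__main__":
    print(len(ACTIONS))  # number of (cell, bucket) pairs
```

A flat index of this kind is one common way to expose a discrete action space to agents such as DQN or REINFORCE, whose output layers typically have one unit per action.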
Abstract
Reliable real-world deployment of reinforcement learning (RL) methods requires a nuanced understanding of their strengths and weaknesses and of how they compare to those of humans. Human-machine systems are becoming more prevalent, and the design of these systems relies on a task-oriented understanding of both human learning (HL) and RL. Thus, an important line of research is characterizing how the structure of a learning task affects learning performance. While increasingly complex benchmark environments have led to improved RL capabilities, such environments are difficult to use for the dedicated study of task structure. To address this challenge, we present a learning environment built to support rigorous study of the impact of task structure on HL and RL. We demonstrate the environment's utility for such study through example experiments on task structure that show performance differences between humans and RL algorithms.
