Experience-driven discovery of planning strategies
Ruiqi He, Falk Lieder
TL;DR
The paper investigates how humans discover new planning strategies under bounded cognitive resources, proposing metacognitive reinforcement learning (RL) as the mechanism of strategy formation. It tests this idea with a novel experiment and formalizes metacognitive RL models that can discover strategies, showing that they explain human strategy discovery better than alternative learning mechanisms. The results indicate that metacognitive RL can produce effective planning strategies but, when fitted to human data, discovers them more slowly than humans do, leaving room for improvement. Overall, the work advances understanding of how adaptive planning strategies emerge and provides a framework for modeling metacognitive strategy discovery, with implications for designing human-aligned AI planners.
Abstract
One explanation for how people can plan efficiently despite limited cognitive resources is that we possess a set of adaptive planning strategies and know when and how to use them. But how are these strategies acquired? While previous research has studied how individuals learn to choose among existing strategies, little is known about the process of forming new planning strategies. In this work, we propose that new planning strategies are discovered through metacognitive reinforcement learning. To test this, we designed a novel experiment to investigate the discovery of new planning strategies. We then present metacognitive reinforcement learning models, demonstrate their capability for strategy discovery, and show that they explain human strategy discovery better than alternative learning mechanisms. However, when fitted to human data, these models discover strategies more slowly than humans do, leaving room for improvement.
