Closed-loop Teaching via Demonstrations to Improve Policy Transparency
Michael S. Lee, Reid Simmons, Henny Admoni
TL;DR
The paper addresses the problem that a demonstration curriculum selected a priori for policy transparency may diverge from a learner's evolving understanding. It proposes a closed-loop teaching framework that interleaves demonstrations, diagnostic tests, and feedback, guided by a Bayesian particle-filter model of human beliefs and informed by the zone of proximal development (ZPD) and the testing effect. Key contributions include the particle-filter human belief model, the closed-loop teaching pipeline, and a user study across delivery and skateboard domains showing a 43% reduction in test regret compared to a baseline. The work advances policy transparency in interactive RL by calibrating explanations to the human learner in situ, potentially improving interpretability and trust in AI systems.
Abstract
Demonstrations are a powerful way of increasing the transparency of AI policies. Though informative demonstrations may be selected a priori through the machine teaching paradigm, student learning may deviate from the preselected curriculum in situ. This paper thus explores augmenting a curriculum with a closed-loop teaching framework inspired by principles from the education literature, such as the zone of proximal development and the testing effect. We utilize tests accordingly to close the loop and maintain a novel particle filter model of human beliefs throughout the learning process, allowing us to provide demonstrations that are targeted to the human's current understanding in real time. A user study finds that our proposed closed-loop teaching framework reduces the regret in human test responses by 43% over a baseline.
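
To make the idea of a particle filter over human beliefs concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the learner's belief is a distribution over reward-weight vectors on the unit sphere, and that each demonstration or test response can be summarized as a half-space constraint on those weights. The function names, the `consistency` noise parameter, and the resampling threshold are all hypothetical.

```python
import numpy as np

def init_particles(n, dim, rng):
    """Sample n candidate reward-weight vectors uniformly on the unit sphere."""
    x = rng.normal(size=(n, dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def update(particles, weights, constraint, consistency=0.9):
    """Reweight particles given a half-space constraint w . c >= 0.

    Assumed noise model: particles satisfying the constraint get likelihood
    `consistency`, all others get 1 - consistency.
    """
    satisfied = particles @ constraint >= 0
    likelihood = np.where(satisfied, consistency, 1.0 - consistency)
    new_weights = weights * likelihood
    return new_weights / new_weights.sum()

def effective_sample_size(weights):
    """Standard particle-filter diagnostic: 1 / sum of squared weights."""
    return 1.0 / np.sum(weights ** 2)

def resample(particles, weights, rng):
    """Multinomial resampling; returns particles with uniform weights."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n)

def prob_correct(particles, weights, constraint):
    """Belief mass on particles consistent with the constraint."""
    return float(weights[particles @ constraint >= 0].sum())

def closed_loop_step(particles, weights, constraint, answered_correctly, rng):
    """One demonstrate -> test -> feedback cycle (hypothetical simplification)."""
    # 1. Demonstration: shift belief toward the demonstrated constraint.
    weights = update(particles, weights, constraint)
    # 2. Test: the response is evidence about what the learner believes.
    evidence = constraint if answered_correctly else -constraint
    weights = update(particles, weights, evidence)
    # 3. Feedback: after a wrong answer, showing the correct response
    #    is modeled here as one more update toward the constraint.
    if not answered_correctly:
        weights = update(particles, weights, constraint)
    # Resample when the weights become too concentrated.
    if effective_sample_size(weights) < 0.5 * len(particles):
        particles, weights = resample(particles, weights, rng)
    return particles, weights
```

A teacher built on this sketch could call `prob_correct` before each test to estimate how likely the learner is to answer correctly, and use that estimate to pick the next demonstration at a suitable difficulty. How the paper actually selects demonstrations and models test responses is not specified in the abstract, so those parts are left out here.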
