Interactive Explanations for Reinforcement-Learning Agents

Yotam Amitai; Ofra Amir; Guy Avni

TL;DR

The paper tackles the challenge of making reinforcement-learning agents explainable through interactive dialogue rather than static summaries. It introduces ASQ-IT, a system that retrieves agent-behavior clips by translating user questions into a fragment of $LTL_f$ and searching an offline library via automata-based processing. The authors demonstrate usability with lay users and show improved debugging performance against a static baseline, highlighting increased engagement and hypothesis exploration. This approach enables structured, user-driven explanations of sequential agent behavior with potential for integration into broader explanatory toolkits in safety-critical domains.

Abstract

As reinforcement learning methods increasingly amass accomplishments, the need for comprehending their solutions becomes more crucial. Most explainable reinforcement learning (XRL) methods generate a static explanation depicting their developers' intuition of what should be explained and how. In contrast, literature from the social sciences proposes that meaningful explanations are structured as a dialog between the explainer and the explainee, suggesting a more active role for the user and her communication with the agent. In this paper, we present ASQ-IT -- an interactive explanation system that presents video clips of the agent acting in its environment based on queries given by the user that describe temporal properties of behaviors of interest. Our approach is based on formal methods: queries in ASQ-IT's user interface map to a fragment of Linear Temporal Logic over finite traces ($LTL_f$) that we developed, and our algorithm for query processing is based on automata theory. User studies show that end-users can understand and formulate queries in ASQ-IT and that using ASQ-IT assists users in identifying faulty agent behaviors.
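To make the idea of a temporal query over recorded agent behavior concrete, here is a minimal sketch. It is not ASQ-IT's implementation: the function `find_clips`, the query shape (a start condition, an end condition, and an optional condition that must hold in between), and the proposition names are all assumptions chosen for illustration. Each time step of a trace is modeled as the set of atomic propositions true at that step, and the scan behaves like a small automaton that starts tracking at a matching start step and either accepts at the first end step or rejects when the in-between condition fails.

```python
from typing import FrozenSet, List, Optional, Tuple

# One time step of a recorded trace: the atomic propositions true at that step.
# (Hypothetical representation, not taken from the paper.)
Step = FrozenSet[str]


def find_clips(
    trace: List[Step],
    start: str,
    end: str,
    always: Optional[str] = None,
) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs delimiting matching clips.

    A clip matches when `start` holds at step i, `end` holds at some later
    step j, and, if `always` is given, `always` holds at every step strictly
    between i and j. Only the shortest clip per start index is reported.
    """
    clips: List[Tuple[int, int]] = []
    for i, step in enumerate(trace):
        if start not in step:
            continue  # automaton stays in its initial state
        for j in range(i + 1, len(trace)):
            if end in trace[j]:
                clips.append((i, j))  # accepting state reached
                break
            if always is not None and always not in trace[j]:
                break  # in-between condition violated: reject this start
    return clips


# Hypothetical driving-style trace with made-up proposition names.
trace = [
    frozenset({"left_lane"}),
    frozenset({"left_lane", "near_car"}),
    frozenset({"middle_lane", "near_car"}),
    frozenset({"right_lane"}),
    frozenset({"left_lane"}),
    frozenset({"middle_lane"}),
    frozenset({"right_lane"}),
]

constrained = find_clips(trace, "left_lane", "right_lane", always="near_car")
unconstrained = find_clips(trace, "left_lane", "right_lane")
```

In this sketch, adding the `always` constraint filters out the final lane change, where no other car is nearby; a user refining a query in this way is the kind of hypothesis exploration the TL;DR describes.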
