Abstracted Trajectory Visualization for Explainability in Reinforcement Learning
Yoshiki Takagi, Roderick Tabalba, Nurit Kirshenbaum, Jason Leigh
TL;DR
The paper tackles the challenge of explaining RL agents to non-RL experts by introducing abstracted trajectory visualization. It combines a $β$-VAE–based trajectory extraction that yields latent representations $z$ and a spatio-temporal clustering step using ST-DBSCAN to identify major states, which are then decoded to form abstracted trajectories. An interactive interface with a map view and a slider view is proposed to reveal temporal dynamics; a preliminary online study indicates that abstracted trajectories support non-RL experts in inferring agent behavior as effectively as complete trajectories, with a user preference for the slider-based navigation. These findings suggest the approach can broaden participation in RL design discussions by providing concise, interpretable explanations, though usability of the map component and alignment of abstractions with human intuition require further refinement.
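The clustering step summarized above can be illustrated with a minimal sketch. It assumes the latent vectors $z$ have already been produced by a trained $β$-VAE encoder (synthetic 2-D latents stand in for them here), and it uses a simplified spatio-temporal density clustering as a stand-in for ST-DBSCAN, not the full algorithm or the authors' implementation. The function names `st_dbscan` and `abstract_trajectory` and the parameters `eps_space`, `eps_time`, and `min_samples` are hypothetical, chosen for illustration only.

```python
import numpy as np

def st_dbscan(z, t, eps_space, eps_time, min_samples):
    """Simplified ST-DBSCAN stand-in: density clustering in latent space and time.

    Returns one integer label per point; -1 marks noise.
    """
    n = len(z)
    # Pairwise latent distances and pairwise time gaps.
    d_space = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    d_time = np.abs(t[:, None] - t[None, :])
    # A neighbor must be close in latent space AND close in time.
    neighbors = [
        np.flatnonzero((d_space[i] <= eps_space) & (d_time[i] <= eps_time))
        for i in range(n)
    ]
    # Core points have enough neighbors (the point itself is counted).
    is_core = np.array([len(nb) >= min_samples for nb in neighbors])

    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster from this unassigned core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels

def abstract_trajectory(z, t, labels):
    """One representative latent (cluster mean) per cluster, ordered by first appearance."""
    ids = [c for c in np.unique(labels) if c != -1]
    if not ids:
        return np.empty((0, z.shape[1]))
    ids.sort(key=lambda c: t[labels == c].min())
    return np.stack([z[labels == c].mean(axis=0) for c in ids])

# Synthetic stand-in for encoder output: the agent dwells near three latent
# locations for 20 timesteps each (assumed data, not from the paper).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [5.0, 5.0]])
z = np.concatenate([c + 0.1 * rng.standard_normal((20, 2)) for c in centers])
t = np.arange(len(z))

labels = st_dbscan(z, t, eps_space=1.0, eps_time=3, min_samples=4)
major = abstract_trajectory(z, t, labels)
print(major.round(2))  # one row per major state, in temporal order
```

In a full pipeline, each row of `major` would be passed through the $β$-VAE decoder to obtain the observation shown to the user; that step is omitted here because it depends on a trained model.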
Abstract
Explainable AI (XAI) has demonstrated the potential to help reinforcement learning (RL) practitioners understand how RL models work. However, XAI for users who do not have RL expertise (non-RL experts) has not been studied sufficiently. This makes it difficult for non-RL experts to participate in the fundamental discussion of how RL models should be designed for a coming society in which humans and AI coexist. Solving this problem would enable RL experts to communicate with non-RL experts in producing machine learning solutions that better fit our society. We argue that abstracted trajectories, which depict transitions between the major states of the RL model, will be useful for non-RL experts to build a mental model of the agents. Our early results suggest that by leveraging a visualization of the abstracted trajectories, users without RL expertise are able to infer the behavior patterns of RL agents.
