From Actions to Words: Towards Abstractive-Textual Policy Summarization in RL
Sahar Admoni, Assaf Hallak, Yftah Ziser, Omer Ben-Porat, Ofra Amir
TL;DR
This work reframes reinforcement learning (RL) policy interpretation as abstractive textual summarization. The two-stage SySLLM framework first uses captioners to convert spatiotemporal trajectories into structured, language-grounded traces stored in a Textual Experience Buffer, then applies a hierarchical, consensus-based abstractive module in which large language models (LLMs) synthesize a global policy narrative that scales to long-horizon policies. In experiments on MiniGrid and Crafter, expert-alignment evaluations, and a large user study, the authors report that SySLLM summaries are faithful, achieve high coverage, are preferred over demonstration-based explanations, and generalize across diverse policies without task-specific fine-tuning. The resulting narratives describe an agent's exploration style, goals, and decision patterns, and are grounded in the agent's traces. The authors present abstractive textual policy summarization as a practical paradigm for interpretable RL, with potential for interactive querying and cross-domain applicability.
Abstract
Explaining reinforcement learning agents is challenging because policies emerge from complex reward structures and neural representations that are difficult for humans to interpret. Existing approaches often rely on curated demonstrations that expose local behaviors but provide limited insight into an agent's global strategy, leaving users to infer intent from raw observations. We propose SySLLM (Synthesized Summary using Large Language Models), a framework that reframes policy interpretation as a language-generation problem. Instead of visual demonstrations, SySLLM converts spatiotemporal trajectories into structured text and prompts an LLM to generate coherent summaries describing the agent's goals, exploration style, and decision patterns. SySLLM scales to long-horizon, semantically rich environments without task-specific fine-tuning, leveraging LLM world knowledge and compositional reasoning to capture latent behavioral structure across policies. Expert evaluations show strong alignment with human analyses, and a large-scale user study found that 75.5% of participants preferred SySLLM summaries over state-of-the-art demonstration-based explanations. Together, these results position abstractive textual summarization as a paradigm for interpreting complex RL behavior.
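As a rough illustration of the first step described above (turning trajectories into structured text and assembling an LLM prompt), here is a minimal Python sketch. It is not the authors' implementation: the `Transition` fields, the caption template, the prompt wording, and the chunk size are all assumptions chosen for illustration, and no LLM is actually called. The chunking helper only hints at how a hierarchical scheme might split a long buffer into pieces that are summarized separately and merged later.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transition:
    """One environment step (hypothetical schema, not taken from the paper)."""
    episode: int
    step: int
    position: Tuple[int, int]
    action: str
    reward: float
    done: bool


def caption(t: Transition) -> str:
    """Rule-based captioner: render one transition as a line of structured text."""
    line = (
        f"Episode {t.episode}, step {t.step}: agent at {t.position} "
        f"took action '{t.action}' and received reward {t.reward:.2f}"
    )
    if t.done:
        line += "; episode ended"
    return line + "."


def build_buffer(transitions: List[Transition]) -> List[str]:
    """Collect captions in order; this stands in for a textual experience buffer."""
    return [caption(t) for t in transitions]


def chunk(lines: List[str], size: int) -> List[List[str]]:
    """Split the buffer into fixed-size chunks (assumed strategy for long traces)."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def build_prompt(lines: List[str]) -> str:
    """Assemble a summarization prompt (wording is an assumption)."""
    header = (
        "Below are textual logs of a reinforcement learning agent. "
        "Summarize the agent's goals, exploration style, and decision patterns.\n\n"
    )
    return header + "\n".join(lines)


# Toy trajectory with made-up values, for demonstration only.
transitions = [
    Transition(0, 0, (1, 1), "forward", 0.0, False),
    Transition(0, 1, (2, 1), "pickup", 0.0, False),
    Transition(0, 2, (2, 1), "toggle", 1.0, True),
]

buffer = build_buffer(transitions)
chunk_prompts = [build_prompt(c) for c in chunk(buffer, size=2)]
prompt = build_prompt(buffer)
print(prompt)
```

In a full pipeline, each entry of `chunk_prompts` would be sent to an LLM and the partial summaries combined in a second pass. How SySLLM performs that consensus step is not specified in this section, so it is left out of the sketch.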
