From Actions to Words: Towards Abstractive-Textual Policy Summarization in RL

Sahar Admoni, Assaf Hallak, Yftah Ziser, Omer Ben-Porat, Ofra Amir

TL;DR

This work reframes reinforcement learning policy interpretation as abstractive textual summarization by converting spatiotemporal trajectories into structured language via a Textual Experience Buffer and synthesizing global policy narratives with large language models. The two-stage SySLLM framework uses captioners to produce language-grounded traces and a hierarchical, consensus-based abstractive module to generate faithful, scalable summaries of long-horizon policies. Through extensive experiments on MiniGrid and Crafter, expert-alignment evaluations, and a large user study, the authors demonstrate that SySLLM produces faithful, high-coverage summaries that are preferred over demonstration-based explanations and can generalize across diverse policies without task-specific fine-tuning. The results support abstractive-textual policy summarization as a practical paradigm for interpretable RL, with potential for interactive querying and cross-domain applicability. The approach meaningfully advances interpretability by delivering global, human-readable narratives that capture exploration style, goals, and decision patterns, grounded in empirical evidence from agent traces.

Abstract

Explaining reinforcement learning agents is challenging because policies emerge from complex reward structures and neural representations that are difficult for humans to interpret. Existing approaches often rely on curated demonstrations that expose local behaviors but provide limited insight into an agent's global strategy, leaving users to infer intent from raw observations. We propose SySLLM (Synthesized Summary using Large Language Models), a framework that reframes policy interpretation as a language-generation problem. Instead of visual demonstrations, SySLLM converts spatiotemporal trajectories into structured text and prompts an LLM to generate coherent summaries describing the agent's goals, exploration style, and decision patterns. SySLLM scales to long-horizon, semantically rich environments without task-specific fine-tuning, leveraging LLM world knowledge and compositional reasoning to capture latent behavioral structure across policies. Expert evaluations show strong alignment with human analyses, and a large-scale user study found that 75.5% of participants preferred SySLLM summaries over state-of-the-art demonstration-based explanations. Together, these results position abstractive textual summarization as a paradigm for interpreting complex RL behavior.
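The two stages described above (captioning trajectories into text, then prompting an LLM for a global summary) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Step` schema, the prompt wording, the chunk-then-merge scheme, and every function name here are hypothetical stand-ins, and `llm` is any callable that maps a prompt string to a response string.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Step:
    """One captioned trajectory step (hypothetical schema, not from the paper)."""
    observation: str  # text from an observation captioner
    action: str       # text from an action captioner


def caption_episode(
    raw_steps: Sequence[Tuple[object, object]],
    obs_captioner: Callable[[object], str],
    act_captioner: Callable[[object], str],
) -> List[Step]:
    """Stage 1: turn raw (observation, action) pairs into text captions."""
    return [Step(obs_captioner(o), act_captioner(a)) for o, a in raw_steps]


def episode_to_text(episode_id: int, steps: Sequence[Step]) -> str:
    """Serialize one captioned episode as structured text for the buffer."""
    lines = [f"Episode {episode_id}:"]
    for t, step in enumerate(steps):
        lines.append(f"  t={t} | obs: {step.observation} | action: {step.action}")
    return "\n".join(lines)


def chunked(items: Sequence[str], size: int) -> List[Sequence[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def summarize_policy(
    episodes_text: Sequence[str],
    llm: Callable[[str], str],
    chunk_size: int = 2,
) -> str:
    """Stage 2: summarize chunks of episodes, then merge the partial summaries.

    The chunk-then-merge scheme is an assumed simplification of a
    hierarchical summarizer; the prompts below are illustrative only.
    """
    partials = [
        llm("Summarize the agent's behavior in these episodes:\n\n" + "\n\n".join(chunk))
        for chunk in chunked(episodes_text, chunk_size)
    ]
    return llm(
        "Merge these partial summaries into one global policy summary, "
        "keeping only claims they agree on:\n\n" + "\n\n".join(partials)
    )
```

In use, the captioners would be environment-specific functions and `llm` a thin wrapper around whichever model client is available. Summarizing chunks before merging keeps each prompt short when the buffer holds many long episodes.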

Paper Structure

This paper contains 29 sections, 22 equations, 6 figures, 6 tables, 2 algorithms.

Figures (6)

  • Figure 1: Collecting the textual experience buffer (Section \ref{sec:experience}).
  • Figure 2: Generating global policy summaries (Section \ref{sec:summary}).
  • Figure 3: Four steps from a trajectory of the Resource-Collector agent in the Crafter environment, alongside their corresponding captions generated using the observation and action captioners. For each step, the captions describe the agent's inventory status, the object currently in front of it, and the next action selected by the agent. A textual representation of the visible grid (highlighted in blue) is also included to reflect the agent's local perception. Additionally, all unique achievements unlocked by the agent throughout the trajectory are summarized in red.
  • Figure 4: Insights from agents' SySLLM summaries in the MiniGrid environments.
  • Figure 5: Participant ratings for Task 1 on a 1–7 Likert scale. SySLLM ratings are significantly higher than HIGHLIGHTS ratings across all metrics.
  • ...and 1 more figure