'Explaining RL Decisions with Trajectories': A Reproducibility Study

Karim Abdel Sadek, Matteo Nulli, Joan Velja, Jort Vincenti

TL;DR

This work investigates the reproducibility of the paper 'Explaining RL decisions with trajectories', and concludes that, while some of the claims can be supported, further investigations and experiments could be of interest.

Abstract

This work investigates the reproducibility of the paper 'Explaining RL decisions with trajectories'. The original paper introduces a novel approach to explainable reinforcement learning that attributes the decisions of an agent to specific clusters of trajectories encountered during training. We examine the main claims of the paper, which state that (i) training on fewer trajectories induces a lower initial state value, (ii) trajectories in a cluster present similar high-level patterns, (iii) distant trajectories influence the decision of an agent, and (iv) humans correctly identify the trajectories attributed to the decision of the agent. We recover one of the environments (Grid-World) from the partial original code provided by the authors, and implement the remaining ones (Seaquest, HalfCheetah, Breakout and Q*Bert) from scratch. We find that (i), (ii), and (iii) partially hold, and we extend the largely qualitative experiments of the authors by introducing a quantitative metric to further support (iii), and new experiments and visual results for (i). Moreover, we investigate the use of different clustering algorithms and encoder architectures to further support (ii). We could not support (iv), given the limited extent of the original experiments. We conclude that, while some of the claims can be supported, further investigations and experiments could be of interest. We recognise the novelty of the work from the authors and hope that our work paves the way for clearer and more transparent approaches.

Paper Structure

This paper contains 29 sections, 2 equations, 13 figures, 10 tables.

Figures (13)

  • Figure 1: Trajectory attribution process by Deshmukh et al. (2023)
  • Figure 2: Correlation between Action Value and the Cluster Attribution Frequency. (i) The plot obtained using the DBSCAN algorithm shows a (weak) correlation of the action value with the attribution frequency of a cluster. We clearly observe that Cluster 1, which was the one attributed most often, is of crucial importance. (ii) The plot obtained using XMeans clearly shows the phenomenon of Claim 2. There is a clear negative correlation between the two quantities, which highlights the importance of data trajectories. Again, the cluster attributed to most agent decisions, i.e. Cluster 7, constitutes a fundamental portion of the training data that leads to a high-value policy.
  • Figure 3: Reproducing and verifying the claim on cluster high-level behaviours in Grid-World. Cluster 1 showcases the behaviour 'Achieving Goal in top right corner', Cluster 6 the behaviour 'Mid-grid journey to goal', and Cluster 2 the behaviour 'Falling into lava'. The three high-level behaviours found match those highlighted by the authors.
  • Figure 4: Clustering differences in Seaquest and HalfCheetah. This figure contrasts the clustering outcomes between our study and the original paper. Panels (a) and (c) illustrate the clusters of the authors for Seaquest and HalfCheetah, while panels (b) and (d) reflect our observations, revealing significant differences in the distribution and number of data points. These discrepancies may highlight the influence of game mode choices, dataset specifics, and data aggregation techniques on clustering outcomes.
  • Figure 4: Summary: Reproduced Results per Game for Each Claim. A check mark represents validated results, an x-mark denotes an invalidated statement for the specific game, and a question mark indicates that we cannot confirm or deny the claim of the authors for this specific game. This may be due to time constraints or because the claim itself lacks sufficient precision, making it impossible to definitively confirm or refute even with additional experimentation.
  • ...and 8 more figures