Graying the black box: Understanding DQNs

Tom Zahavy, Nir Ben Zrihem, Shie Mannor

TL;DR

The paper investigates how Deep Q-Networks (DQNs) acquire internal structure from high-dimensional inputs, proposing the Semi Aggregated MDP (SAMDP) to extract spatio-temporal abstractions. It introduces manual clustering and SAMDP as tools for interpretability, debugging, and sub-goal detection, and demonstrates their utility on Gridworld and several Atari games (Breakout, Seaquest, Pacman). SAMDP combines temporal and spatial abstractions to identify options and hierarchical policy structures, enabling analysis of policy behavior, state initialization/termination handling, and score-pixel effects. The work shows that DQNs organize the state space into sub-manifolds with low-entropy transitions, offering a path toward more interpretable and potentially more efficient DRL, including shared-autonomy safeguards via eject-based interventions.

Abstract

In recent years there has been growing interest in using deep representations for reinforcement learning. In this paper, we present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. Moreover, we propose a new model, the Semi Aggregated Markov Decision Process (SAMDP), and an algorithm that learns it automatically. The SAMDP model allows us to identify spatio-temporal abstractions directly from features and may be used as a sub-goal detector in future work. Using our tools we reveal that the features learned by DQNs aggregate the state space in a hierarchical fashion, explaining their success. Moreover, we are able to understand and describe the policies learned by DQNs for three different Atari 2600 games and suggest ways to interpret, debug and optimize deep neural networks in reinforcement learning.
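
To make the idea of aggregated states and inter-cluster transitions concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that every time step of a trajectory has already been assigned a cluster id by some clustering of the learned features; the function names `samdp_transitions` and `transition_entropy` and the example trajectory are hypothetical. It estimates transition probabilities between clusters and the entropy of each cluster's outgoing transitions, where low entropy indicates that the agent leaves a cluster in a predictable way.

```python
import math
from collections import Counter, defaultdict


def samdp_transitions(labels):
    """Estimate inter-cluster transition probabilities from one trajectory.

    `labels` holds one cluster id per time step. Steps that stay inside the
    same cluster are ignored, so every counted transition corresponds to
    leaving one cluster and entering a different one.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(labels, labels[1:]):
        if prev != nxt:  # only count moves between different clusters
            counts[prev][nxt] += 1
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


def transition_entropy(row):
    """Shannon entropy (in bits) of one row of transition probabilities."""
    return -sum(p * math.log2(p) for p in row.values() if p > 0)


# Hypothetical cluster labels for a short trajectory (illustrative only).
trajectory = [0, 0, 1, 1, 1, 2, 0, 0, 1, 2, 2, 0, 1, 0]
probs = samdp_transitions(trajectory)
entropies = {src: transition_entropy(row) for src, row in probs.items()}
```

In this toy trajectory, clusters 0 and 2 each have a single successor, so their outgoing entropy is zero, while cluster 1 splits between two successors and has higher entropy.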

Paper Structure

This paper contains 20 sections, 8 equations, 19 figures.

Figures (19)

  • Figure 1: Graphical user interface for our methodology.
  • Figure 2: Left: Illustration of state aggregation and skills. Primitive actions (orange arrows) cause transitions between MDP states (black dots) while skills (red arrows) induce transitions between SAMDP states (blue circles). Right: Modeling approaches for analyzing policies.
  • Figure 3: State-action diagrams for a gridworld problem. a. MDP diagram: relates to individual states and primitive actions. b. SMDP diagram: edge colors represent different skills. c. AMDP diagram: clusters are formed using spatial aggregation in the original state space. d. SAMDP diagram: clusters are found after transforming the state space. Intra-cluster transitions (dashed arrows) can be used to explain the skills, while inter-cluster transitions (big red arrows) loyally explain the governing policy.
  • Figure 4: Breakout aggregated states on the t-SNE map.
  • Figure 5: Breakout tunnel digging option. Left: states that the agent visits from the moment it enters the option clusters (1-3 in Figure 4) until it finishes carving the left tunnel are marked in red. Right: dynamics are displayed by arrows above a 3D t-SNE map. The option termination zone is marked by a black annotation box and corresponds to carving the left tunnel. All transitions from clusters 1-3 into clusters 4-7 pass through a singular point.
  • ...and 14 more figures