Understanding and Diagnosing Deep Reinforcement Learning

Ezgi Korkmaz

TL;DR

The results reveal fundamental properties of the decision process of reinforcement learning policies, and can help in constructing reliable and robust deep neural policies.

Abstract

Deep neural policies have recently been deployed in a diverse range of settings, from biotechnology to automated financial systems. However, the use of deep neural networks to approximate the value function raises concerns about the stability of the decision boundary, in particular regarding the sensitivity of policy decision making to indiscernible, non-robust features due to the highly non-convex and complex nature of deep neural manifolds. These concerns constitute an obstacle to understanding the reasoning made by deep neural policies, and their foundational limitations. Hence, it is crucial to develop techniques that aim to understand the sensitivities in the learnt representations of neural network policies. To achieve this, we introduce a theoretically founded method that provides a systematic analysis of the unstable directions in the deep neural policy decision boundary across both time and space. Through experiments in the Arcade Learning Environment (ALE), we demonstrate the effectiveness of our technique for identifying correlated directions of instability, and for measuring how sample shifts remold the set of sensitive directions in the neural policy landscape. Most importantly, we demonstrate that state-of-the-art robust training techniques yield learning of disjoint unstable directions, with dramatically larger oscillations over time, when compared to standard training. We believe our results reveal fundamental properties of the decision process of reinforcement learning policies, and can help in constructing reliable and robust deep neural policies.

Paper Structure

This paper contains 12 sections, 2 theorems, 14 equations, 5 figures, 2 tables, and 1 algorithm.

Key Result

Proposition 3.3

Given a set of $n$ states $S = \{s_i\}_{i=1}^{n}$, define the matrix $\mathcal{L}(S) = \sum_{i=1}^{n} \nabla_{s_g} J(s_i, s_g)\, \nabla_{s_g} J(s_i, s_g)^{\top}$, the sum of the outer products of the per-state gradients. Then $\mathcal{G}_S$ is the eigenvector corresponding to the largest eigenvalue of $\mathcal{L}(S)$: by the Rayleigh quotient, this unit vector maximizes $\sum_{i=1}^{n} \langle \nabla_{s_g} J(s_i, s_g), v \rangle^2 = v^{\top} \mathcal{L}(S)\, v$, i.e. the aggregate squared alignment of the gradients with a single direction.
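
For concreteness, here is a minimal NumPy sketch of Proposition 3.3. It assumes the per-state gradients have already been computed and flattened into the rows of an $n \times d$ array; the function name and interface are hypothetical, not taken from the paper:

```python
import numpy as np

def principal_non_lipschitz_direction(grads: np.ndarray) -> np.ndarray:
    """Sketch of Proposition 3.3: top eigenvector of L(S).

    grads: (n, d) array whose i-th row is the flattened gradient for state s_i
    (hypothetical input format; the paper obtains these from the policy network).
    Returns a unit-norm vector of length d, the principal non-Lipschitz
    direction G_S.
    """
    # L(S) = sum_i g_i g_i^T; grads.T @ grads forms this sum of outer products.
    L = grads.T @ grads
    # L(S) is symmetric positive semi-definite, so eigh applies
    # (eigenvalues are returned in ascending order).
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    # The eigenvector paired with the largest eigenvalue is G_S.
    return eigenvectors[:, -1]
```

For image observations $d$ is large, so an equivalent and cheaper route is the top right singular vector of the $n \times d$ gradient matrix (e.g. via np.linalg.svd), since the right singular vectors of that matrix are exactly the eigenvectors of $\mathcal{L}(S)$.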

Figures (5)

  • Figure 1: RA-NLD results of untransformed states and states under adversarial perturbations computed via Carlini & Wagner, Nesterov Momentum, and elastic-net regularization for Pong and BankHeist. Row 1: Pong. Row 2: BankHeist. Column 1: Untransformed. Column 2: C&W. Column 3: Nesterov Momentum. Column 4: Elastic-Net
  • Figure 2: Fourier spectrum of the RA-NLD of the state-of-the-art adversarially and vanilla trained deep neural policies (a sketch of the spectrum computation follows this list). Row 1: Adversarial. Row 2: Vanilla. Column 1: RoadRunner. Column 2: BankHeist. Column 3: Pong. Column 4: Freeway
  • Figure 3: Standardized gradients $\lVert \nabla_{s_g} J(s_i,s_g) \rVert^2$ for vanilla trained and state-of-the-art certified adversarially trained deep reinforcement learning policies.
  • Figure 4: Principal non-Lipschitz direction $\mathcal{G}(i,j)$ for the state-of-the-art certified adversarially trained deep reinforcement learning policies for BankHeist, Pong, Freeway and RoadRunner.
  • Figure 5: RA-NLD results of untransformed state observations and states under natural transformations with rotation, perspective transformation, blurring, compression artifacts, and brightness and contrast (B&C) for Pong.
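
As a rough illustration of the spectra visualized in Figure 2, the sketch below computes the log-magnitude 2-D Fourier spectrum of a flattened non-Lipschitz direction. The $84 \times 84$ frame shape is an assumption of this sketch (the standard preprocessed ALE resolution), as is the function name:

```python
import numpy as np

def ranld_fourier_spectrum(direction: np.ndarray,
                           frame_shape: tuple = (84, 84)) -> np.ndarray:
    """Log-magnitude Fourier spectrum of a non-Lipschitz direction.

    direction: flattened direction over one observation; frame_shape is assumed
    to be the standard 84x84 preprocessed ALE frame (an assumption of this
    sketch, not a detail stated on this page).
    """
    frame = direction.reshape(frame_shape)
    # Shift the zero-frequency component to the center, as spectra are
    # conventionally displayed.
    spectrum = np.fft.fftshift(np.fft.fft2(frame))
    # Log scaling compresses the dynamic range for visualization.
    return np.log1p(np.abs(spectrum))
```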

Theorems & Definitions (8)

  • Definition 3.1: $\epsilon$-non-Lipschitz Direction
  • Definition 3.2: Principal non-Lipschitz direction
  • Proposition 3.3: Spectral characterization of principal non-Lipschitz directions
  • Proof of Proposition 3.3
  • Definition 3.4: Encountered set of states
  • Definition 3.5: Feature Correlation Quotient
  • Proposition 3.6: Boundedness of Feature Correlation Quotient
  • Proof of Proposition 3.6