Safety-Oriented Pruning and Interpretation of Reinforcement Learning Policies

Dennis Gross, Helge Spieker

TL;DR

VERINTER is an interpretable RL method that combines NN pruning with model checking to make RL safety interpretable. By analyzing changes in safety measurements, it exactly quantifies the effects of pruning and the impact of neural connections on complex safety properties.

Abstract

Pruning neural networks (NNs) can streamline them but risks removing parameters that are vital to the safety of reinforcement learning (RL) policies. We introduce an interpretable RL method called VERINTER, which combines NN pruning with model checking to ensure interpretable RL safety. VERINTER exactly quantifies the effects of pruning and the impact of neural connections on complex safety properties by analyzing changes in safety measurements. The method maintains safety in pruned RL policies and improves understanding of their safety dynamics, and it has proven effective in multiple RL settings.
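The idea of measuring how pruning changes a safety property can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a five-state corridor instead of the Taxi environment, a single linear layer with one-hot inputs as the policy, a slip probability of 0.1, and a plain linear solve standing in for a real model checker. The names `policy_action` and `unsafe_reachability` are hypothetical helpers.

```python
import numpy as np

# Toy corridor (assumed): state 0 is unsafe, state 4 is the goal, start in state 2.
N_STATES, UNSAFE, GOAL, START = 5, 0, 4, 2
SLIP = 0.1  # assumed probability that the chosen move is reversed


def policy_action(W, b, s):
    """Greedy action of a linear policy on a one-hot state: 0 = left, 1 = right."""
    x = np.zeros(N_STATES)
    x[s] = 1.0
    return int(np.argmax(x @ W + b))


def unsafe_reachability(W, b):
    """Probability of reaching UNSAFE before GOAL from START under the policy.

    The policy induces a Markov chain; for the transient states we solve
    (I - P_TT) p = P_T,unsafe, which is the standard reachability equation.
    """
    transient = [s for s in range(N_STATES) if s not in (UNSAFE, GOAL)]
    idx = {s: i for i, s in enumerate(transient)}
    A = np.eye(len(transient))
    rhs = np.zeros(len(transient))
    for s in transient:
        step = 1 if policy_action(W, b, s) == 1 else -1
        # Intended move with probability 1 - SLIP, reversed move with SLIP.
        for nxt, prob in ((s + step, 1 - SLIP), (s - step, SLIP)):
            if nxt == UNSAFE:
                rhs[idx[s]] += prob
            elif nxt != GOAL:
                A[idx[s], idx[nxt]] -= prob
    return float(np.linalg.solve(A, rhs)[idx[START]])


# Assumed weights: every state prefers "right" (logits 0.7 vs 1.0).
W = np.tile([0.2, 1.0], (N_STATES, 1))
b = np.array([0.5, 0.0])

baseline = unsafe_reachability(W, b)

# Prune one connection at a time and record the change in the safety measurement.
impact = {}
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        pruned = W.copy()
        pruned[i, j] = 0.0
        impact[(i, j)] = unsafe_reachability(pruned, b) - baseline

for conn, delta in sorted(impact.items(), key=lambda kv: -kv[1]):
    print(f"prune W{conn}: change in P(reach unsafe) = {delta:+.4f}")
```

In this sketch, connections whose removal leaves the greedy action unchanged show zero impact and could be pruned without affecting the measured property, while connections that flip an action in a reachable state raise the probability of reaching the unsafe state. A probabilistic model checker would replace the linear solve when the property or the environment is more complex.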

Paper Structure (14 sections, 2 figures, 1 table)

Figures (2)

  • Figure 1: Pruning methods across three Taxi events. The x-axis shows the percentage of pruned weights in the input layer, and the y-axis indicates the reachability probability of specified events (in brackets). Random pruning sample size: 10.
  • Figure 2: Each subfigure shows safety measurements for different NN layers, with the x-axis representing the percentage of pruned connections and the y-axis showing safety outcomes in the Taxi environment.

Theorems & Definitions (2)

  • Definition 1: MDP
  • Definition 2: Policy