Unveiling the Decision-Making Process in Reinforcement Learning with Genetic Programming

Manuel Eberhardinger; Florian Rupp; Johannes Maucher; Setareh Maghsudi

TL;DR

The paper tackles the opacity of reinforcement learning decisions by introducing a genetic programming (GP) framework that imitates a trained agent with interpretable, executable programs written in a typed, Lisp-like DSL. It integrates a library-learning module (via Stitch) to extract reusable abstractions, and it uses curriculum-guided GP with bloat control to keep the resulting explanations concise. In a grid-world maze domain, the GP approach achieves competitive or superior accuracy for longer decision sequences while using substantially less hardware and computation time than state-of-the-art program-synthesis baselines. The work shows that GP can provide executable, resource-efficient explanations for RL policies, and it comes with an open-source implementation for further research.

Abstract

Despite tremendous progress, machine learning and deep learning still suffer from incomprehensible predictions. Incomprehensibility, however, is not an option for the use of (deep) reinforcement learning in the real world, as unpredictable actions can seriously harm the individuals involved. In this work, we propose a genetic programming framework that generates explanations for the decision-making process of already trained agents by imitating them with programs. Programs are interpretable and can be executed to explain why the agent chooses a particular action. Furthermore, we conduct an ablation study that investigates how extending the domain-specific language through library learning alters the performance of the method. We compare our results with the previous state of the art for this problem and show that our method is comparable in performance but requires far fewer hardware resources and much less computation time.
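The imitation idea in the abstract can be illustrated with a minimal sketch: a candidate program is scored by how often it reproduces the action the trained agent chose for the same state. Everything below is an assumption made for illustration and is not taken from the paper's implementation: the names `imitation_accuracy` and `select_best`, the encoding of a state as a small grid of integers, and the two toy programs are all hypothetical, and the paper's actual fitness function may differ.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical encoding: a state is a small grid (0 = free cell, 1 = wall),
# and an action is an integer id. Neither is taken from the paper.
State = Tuple[Tuple[int, ...], ...]
Action = int
Program = Callable[[State], Action]

LEFT, RIGHT = 0, 1


def imitation_accuracy(program: Program,
                       pairs: Sequence[Tuple[State, Action]]) -> float:
    """Fraction of (state, action) pairs where the program matches the agent."""
    if not pairs:
        return 0.0
    hits = sum(1 for state, action in pairs if program(state) == action)
    return hits / len(pairs)


def select_best(candidates: Sequence[Program],
                pairs: Sequence[Tuple[State, Action]]) -> Program:
    """Pick the candidate that imitates the sampled agent behaviour best."""
    return max(candidates, key=lambda p: imitation_accuracy(p, pairs))


# Two toy candidate programs (stand-ins for evolved DSL programs).
def go_right_if_free(state: State) -> Action:
    # Move right when the cell to the right of the centre is free.
    return RIGHT if state[1][2] == 0 else LEFT


def always_left(state: State) -> Action:
    return LEFT


# State-action pairs as they might be sampled from a trained agent.
pairs = [
    (((1, 1, 1), (1, 0, 0), (1, 1, 1)), RIGHT),
    (((1, 1, 1), (0, 0, 1), (1, 1, 1)), LEFT),
]

best = select_best([always_left, go_right_if_free], pairs)
print(best.__name__, imitation_accuracy(best, pairs))
```

In a genetic programming loop, a score of this kind would typically serve as (part of) the fitness used to rank candidates; the sketch covers only the scoring step, not mutation, crossover, bloat control, or library learning.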

Paper Structure (18 sections, 3 equations, 6 figures, 2 tables)

Introduction
Related work
Background
Program and Domain-specific Language
Program Synthesis with Library Learning for Reinforcement Learning
Case Study: Drawbacks of previous Method
Methodology
Genetic Programming
Library Learning
Experiments
Domain
Evaluation
Runtime Improvements
Accuracy
Library Learning
...and 3 more sections

Figures (6)

Figure 1: The overview of the problem setting. We train an agent to sample state-action pairs from the learned policy. These examples are then imitated with the genetic programming algorithm to generate explanations for the agent's decision-making process.
Figure 2: The abstract syntax tree for the example program from the paper's listing.
Figure 3: The medium-sized perfect maze environment used to evaluate the proposed method.
Figure 4: The evaluation of the genetic programming algorithm compared to the baseline methods. The accuracy is displayed on the y-axis and the sequence length on the x-axis.
Figure 5: The difference between using library learning with genetic programming and without.
...and 1 more figure
