XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques

Yu Xiong; Zhipeng Hu; Ye Huang; Runze Wu; Kai Guan; Xingchen Fang; Ji Jiang; Tianze Zhou; Yujing Hu; Haoyu Liu; Tangjie Lyu; Changjie Fan

TL;DR

This paper presents XRL-Bench, a unified benchmark for evaluating state-explanation methods in Explainable Reinforcement Learning, addressing the lack of standardized evaluation in XRL. It defines a three-module framework (environments, state-explanation explainers, and evaluators) and supports both tabular and image state inputs, along with a novel TabularSHAP method. The authors propose objective fidelity and stability metrics and provide extensive benchmarking across multiple RL environments, revealing that TabularSHAP achieves superior fidelity on tabular data while SHAP-based gradient methods excel for image data. A real-world online gaming case study demonstrates practical utility for debugging and improving RL-driven AI bots, and the work is complemented by an open-source benchmark platform to promote reproducibility and extension. The contribution advances XRL research by delivering a repeatable, scalable evaluation framework and a competitive new explainability method with demonstrated industrial relevance.

Abstract

Reinforcement Learning (RL) has demonstrated substantial potential across diverse fields, yet understanding its decision-making process, especially in real-world scenarios where rationality and safety are paramount, is an ongoing challenge. This paper delves into Explainable RL (XRL), a subfield of Explainable AI (XAI) aimed at unravelling the complexities of RL models. Our focus rests on state-explaining techniques, a crucial subset within XRL methods, as they reveal the underlying factors influencing an agent's actions at any given time. Despite their significant role, the lack of a unified evaluation framework hinders assessment of their accuracy and effectiveness. To address this, we introduce XRL-Bench, a unified, standardized benchmark tailored for the evaluation and comparison of XRL methods, encompassing three main modules: standard RL environments, explainers based on state importance, and standard evaluators. XRL-Bench supports both tabular and image data for state explanation. We also propose TabularSHAP, an innovative and competitive XRL method. We demonstrate the practical utility of TabularSHAP in real-world online gaming services and offer an open-source benchmark platform for the straightforward implementation and evaluation of XRL methods. Our contributions facilitate the continued progression of XRL technology.

Paper Structure (19 sections, 3 equations, 5 figures, 4 tables)

Introduction
Related Work
Explainable RL
Evaluation Metrics for Explanations
Overview of XRL-Bench Framework
Environments, Policy Models and Datasets
Explainers
Evaluation Metrics
Benchmarking Analysis
Experimental Setup
Fidelity
Stability
Computational Efficiency
Real-World Application of XRL: A Case Study
Role of AI Bots in Online Gaming
...and 4 more sections

Figures (5)

Figure 1: The XRL-Bench framework.
Figure 2: The computational efficiency comparison of seven XRL methods.
Figure 3: Waterfall plot for XRL episode analysis. The plot shows how each state's contribution pushes the model output from its base value (the average model output over the dataset) to the final model output. States that increase the model prediction are shown in red, while those that decrease it are shown in blue.
Figure 4: Summary plot for XRL global analysis. The plot orders states by the cumulative magnitude of their SHAP values and uses those values to show the distribution of each state's influence.
Figure 5: Dependence plot for XRL global analysis. The plot shows the SHAP value of a specific state on the y-axis against the corresponding feature value on the x-axis.
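
The additive reading behind the waterfall and summary plots can be illustrated with a small sketch. It is not the paper's TabularSHAP method and not the XRL-Bench API: it computes exact Shapley values by brute force for a hypothetical three-feature scoring function (`toy_policy_score`), using a made-up background dataset. It checks that the base value plus the per-feature contributions equals the model output, and ranks features by summed absolute contribution, as a summary plot would.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values for input x; absent features are averaged over background rows."""
    n = len(x)

    def value(subset):
        # Mean model output when only the features in `subset` are taken from x.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                phi[i] += weight * (value(set(s) | {i}) - value(set(s)))
    base = value(set())  # average model output over the background data
    return base, phi


def toy_policy_score(state):
    # Hypothetical stand-in for a policy's action score; not taken from the paper.
    health, distance, ammo = state
    return 2.0 * health - 1.0 * distance + 0.5 * health * ammo


# Made-up background states and two states to explain.
background = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [0.5, 1.0, 0.0], [1.0, 0.0, 2.0]]
states = [[1.0, 3.0, 2.0], [0.2, 0.5, 1.0]]

# Waterfall view: base value plus contributions reconstructs the output.
base, phi = shapley_values(toy_policy_score, states[0], background)
reconstructed = base + sum(phi)

# Summary view: rank features by summed absolute contribution across states.
importance = [0.0, 0.0, 0.0]
for s in states:
    _, contributions = shapley_values(toy_policy_score, s, background)
    for i, c in enumerate(contributions):
        importance[i] += abs(c)
order = sorted(range(3), key=lambda i: importance[i], reverse=True)
```

Brute-force enumeration is exponential in the number of features, so it only suits toy inputs like this one; practical SHAP-style explainers rely on approximations or model-specific algorithms.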
