Hyperparameter Optimisation with Practical Interpretability and Explanation Methods in Probabilistic Curriculum Learning
Llewyn Salt, Marcus Gallagher
TL;DR
The paper addresses hyperparameter optimisation (HPO) in reinforcement learning (RL), focusing on Probabilistic Curriculum Learning (PCL). It studies how hyperparameters interact, using the AlgOS framework with Optuna's Tree-Structured Parzen Estimator (TPE), and introduces a SHAP-based interpretability method that attributes performance effects to individual hyperparameters and their interactions. These insights are validated on point-maze navigation and DC motor control, with SAC as the agent. Key contributions include practical guidelines for refining hyperparameter bounds, a novel SHAP-based analysis pipeline for RL HPO, and a demonstration that curriculum-related hyperparameters (notably $Q_{lower}$) interact with agent parameters to substantially affect performance. Together, these make hyperparameter tuning in RL more efficient and more interpretable, and they clarify how curricula influence learning outcomes. The evaluation metric is the average success, $f_{objective} = \frac{1}{N}\sum_N g_{success}$, and the results suggest concrete directions for adjusting search bounds to improve optimisation outcomes.
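As a rough illustration of the evaluation metric, the sketch below computes the mean of a set of success indicators. It assumes that $g_{success}$ is a binary per-goal (or per-episode) outcome and that $N$ is the number of outcomes evaluated; the summary does not spell out either detail, and the function name `success_rate` is hypothetical.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Mean of success indicators, i.e. (1/N) * sum of g_success.

    Assumption: each entry is one evaluated goal or episode,
    True if the agent succeeded and False otherwise.
    """
    if not successes:
        raise ValueError("need at least one evaluated outcome")
    return sum(1.0 if s else 0.0 for s in successes) / len(successes)


# Hypothetical evaluation run: 3 successes out of 4 attempts.
outcomes = [True, False, True, True]
rate = success_rate(outcomes)
print(rate)  # 0.75
```

An optimiser such as TPE would then maximise this value over hyperparameter configurations.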
Abstract
Hyperparameter optimisation (HPO) is crucial for achieving strong performance in reinforcement learning (RL), as RL algorithms are inherently sensitive to hyperparameter settings. Probabilistic Curriculum Learning (PCL) is a curriculum learning strategy designed to improve RL performance by structuring the agent's learning process, yet effective hyperparameter tuning remains challenging and computationally demanding. In this paper, we provide an empirical analysis of hyperparameter interactions and their effects on the performance of a PCL algorithm within standard RL tasks, including point-maze navigation and DC motor control. Using the AlgOS framework integrated with Optuna's Tree-Structured Parzen Estimator (TPE), we present strategies to refine hyperparameter search spaces, enhancing optimisation efficiency. Additionally, we introduce a novel SHAP-based interpretability approach tailored specifically for analysing hyperparameter impacts, offering clear insights into how individual hyperparameters and their interactions influence RL performance. Our work contributes practical guidelines and interpretability tools that significantly improve the effectiveness and computational feasibility of hyperparameter optimisation in reinforcement learning.
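To make the idea of attributing performance to hyperparameters more concrete, the sketch below computes exact Shapley values for a toy score function over two hyperparameters. It is not the authors' pipeline: the score function, the hyperparameter names `q_lower` and `learning_rate_scale`, and the baseline configuration are all hypothetical, and practical SHAP tooling typically approximates these values on a surrogate model fitted to trial results rather than enumerating every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, config, baseline):
    """Exact Shapley attribution of value_fn(config) - value_fn(baseline).

    A hyperparameter that is "absent" from a coalition takes its baseline value.
    Enumerates all subsets, so this is only feasible for a few hyperparameters.
    """
    names = list(config)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {m: (config[m] if m in subset else baseline[m]) for m in names}
                with_name = dict(without)
                with_name[name] = config[name]
                total += weight * (value_fn(with_name) - value_fn(without))
        phi[name] = total
    return phi


def toy_score(h):
    # Hypothetical surrogate: two additive effects plus an interaction term.
    return (0.2 * h["q_lower"]
            + 0.1 * h["learning_rate_scale"]
            + 0.4 * h["q_lower"] * h["learning_rate_scale"])


baseline = {"q_lower": 0.0, "learning_rate_scale": 0.0}
config = {"q_lower": 1.0, "learning_rate_scale": 1.0}
phi = shapley_values(toy_score, config, baseline)
print(phi)
```

In this toy case the interaction term is split evenly between the two hyperparameters, so each attribution exceeds the hyperparameter's purely additive effect, and the attributions sum to the difference in score between the configuration and the baseline.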
