Ablation Study of How Run Time Assurance Impacts the Training and Performance of Reinforcement Learning Agents
Nathaniel Hamilton, Kyle Dunlap, Taylor T Johnson, Kerianne L Hobbs
TL;DR
The paper addresses the evaluation of Safe Reinforcement Learning (SRL) by analyzing how Run Time Assurance (RTA) affects RL training and performance. It conducts a large-scale ablation study (880 agents, 88 configurations) across Pendulum and Spacecraft Docking (2D/3D) environments, comparing four RTA monitoring approaches and six training configurations for both PPO (on-policy) and SAC (off-policy). Key contributions include establishing evaluation best practices, identifying the baseline punishment and RTA punishment configurations as consistently effective, showing that explicit simplex is the most reliable RTA approach, and highlighting that reward shaping often matters more than safe exploration. The work provides a practical framework for fair SRL comparisons and offers guidance for deploying safe RL in real-world cyber-physical systems.
Abstract
Reinforcement Learning (RL) has become an increasingly important research area as the success of machine learning algorithms and methods grows. To combat the safety concerns surrounding the freedom given to RL agents while training, there has been an increase in work concerning Safe Reinforcement Learning (SRL). However, these new and safe methods have been subjected to less scrutiny than their unsafe counterparts. For instance, comparisons among safe methods often lack fair evaluation across similar initial condition bounds and hyperparameter settings, use poor evaluation metrics, and cherry-pick the best training runs rather than averaging over multiple random seeds. In this work, we conduct an ablation study using evaluation best practices to investigate the impact of run time assurance (RTA), which monitors the system state and intervenes to assure safety, on effective learning. By studying multiple RTA approaches in both on-policy and off-policy RL algorithms, we seek to understand which RTA methods are most effective, whether the agents become dependent on the RTA, and the importance of reward shaping versus safe exploration in RL agent training. Our conclusions shed light on the most promising directions of SRL, and our evaluation methodology lays the groundwork for creating better comparisons in future SRL work.
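To make the phrase "monitors the system state and intervenes to assure safety" concrete, the following is a minimal sketch of a generic simplex-style switching filter on a toy one-dimensional system. It is not the implementation used in the paper: the class name `SimplexRTA`, the dynamics `x_next = x + u * dt`, the safe set `|x| <= x_max`, and the proportional backup controller are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SimplexRTA:
    """Toy simplex-style RTA filter (all parameters are hypothetical)."""

    x_max: float = 1.0  # assumed safe-set bound: |x| <= x_max
    dt: float = 0.1     # assumed integration step
    gain: float = 1.0   # assumed backup-controller gain

    def is_safe(self, x: float) -> bool:
        # Explicitly defined safe set for the toy system.
        return abs(x) <= self.x_max

    def step(self, x: float, u: float) -> float:
        # Assumed single-integrator dynamics used for one-step prediction.
        return x + u * self.dt

    def backup(self, x: float) -> float:
        # Backup controller: push the state back toward the origin.
        return -self.gain * x

    def filter(self, x: float, u_desired: float) -> Tuple[float, bool]:
        # Pass the agent's action through if the predicted next state is
        # safe; otherwise substitute the backup action.
        if self.is_safe(self.step(x, u_desired)):
            return u_desired, False
        return self.backup(x), True


rta = SimplexRTA()
action, intervened = rta.filter(0.95, 1.0)  # desired action would exit the safe set
```

The returned `intervened` flag is the kind of signal a training loop could log or use for reward shaping, for example to penalize the agent whenever the filter overrides its action.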
