RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning
Mingqi Yuan, Roger Creus Castanyer, Bo Li, Xin Jin, Wenjun Zeng, Glen Berseth
TL;DR
RLeXplore provides a standardized, modular framework implementing eight intrinsic reward methods to accelerate research in intrinsically-motivated RL. By decoupling intrinsic reward modules from RL optimization and documenting implementation details, it enables fair comparisons, reproducibility, and rapid integration with existing libraries. The study shows that design choices such as normalization, update dynamics, and memory usage substantially affect performance, and that combining intrinsic rewards can yield emergent, high-quality exploration in sparse-reward or reward-free settings. The framework’s open-source resources and benchmarks support broader adoption and progress toward robust, autonomous RL agents that operate with minimal extrinsic supervision.
Abstract
Extrinsic rewards can effectively guide reinforcement learning (RL) agents in specific tasks. However, extrinsic rewards frequently fall short in complex environments due to the significant human effort needed for their design and annotation. This limitation underscores the necessity for intrinsic rewards, which offer auxiliary and dense signals and can enable agents to learn in an unsupervised manner. Although various intrinsic reward formulations have been proposed, their implementation and optimization details are insufficiently explored and lack standardization, thereby hindering research progress. To address this gap, we introduce RLeXplore, a unified, highly modularized, and plug-and-play framework offering reliable implementations of eight state-of-the-art intrinsic reward methods. Furthermore, we conduct an in-depth study that identifies critical implementation details and establishes well-justified standard practices in intrinsically-motivated RL. Our documentation, examples, and source code are available at https://github.com/RLE-Foundation/RLeXplore.
