Towards Continual Reinforcement Learning: A Review and Perspectives
Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup
TL;DR
The paper surveys continual reinforcement learning by formalizing non-stationarity through a two-axis taxonomy (scope and driver) and offering a unifying view that generalizes existing CRL formulations. It categorizes methods into explicit knowledge retention, shared-structure approaches, and meta-learning, detailing representative techniques like rehearsal, distillation, modular architectures, state abstractions, goals, and auxiliary tasks. It also covers evaluation practices, benchmarks, and robust metrics for forward/backward transfer and skill reuse, arguing for richer, principled CRL benchmarks. Finally, it discusses neuroscience-inspired directions and open problems needed to bridge the gap between current CRL methods and real-world deployment.
Abstract
In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations by mathematically characterizing two key properties of non-stationarity, namely, its scope and its driver. This characterization offers a unified view of various formulations. Next, we review and present a taxonomy of continual RL approaches. We go on to discuss evaluation of continual RL agents, providing an overview of benchmarks used in the literature and important metrics for understanding agent performance. Finally, we highlight open problems and challenges in bridging the gap between the current state of continual RL and findings in neuroscience. While still in its early days, the study of continual RL has the promise to develop better incremental reinforcement learners that can function in increasingly realistic applications where non-stationarity plays a vital role. These include applications in fields such as healthcare, education, logistics, and robotics.
