Plasticine: Accelerating Research in Plasticity-Motivated Deep Reinforcement Learning

Mingqi Yuan, Qi Wang, Guozheng Ma, Bo Li, Xin Jin, Yunbo Wang, Xiaokang Yang, Wenjun Zeng, Dacheng Tao

TL;DR

Plasticine addresses plasticity loss in deep RL by providing the first open-source framework for benchmarking plasticity optimization. It offers a modular architecture with four core components (Methods, Metrics, Environments, Benchmark), over 13 mitigation techniques, 10+ evaluation metrics, and learning scenarios at three levels of non-stationarity. The work enables systematic, fair comparisons of mitigation strategies and plasticity dynamics in standard, continual, and open-ended RL settings, promoting progress toward lifelong learning agents. By integrating single-file implementations, diverse benchmarks, and comprehensive diagnostics, Plasticine supports rigorous analysis and faster development of plasticity-preserving methods for real-world, non-stationary environments.

Abstract

Developing lifelong learning agents is crucial for artificial general intelligence. However, deep reinforcement learning (RL) systems often suffer from plasticity loss, where neural networks gradually lose their ability to adapt during training. Despite its significance, this field lacks unified benchmarks and evaluation protocols. We introduce Plasticine, the first open-source framework for benchmarking plasticity optimization in deep RL. Plasticine provides single-file implementations of over 13 mitigation methods, 10 evaluation metrics, and learning scenarios with increasing non-stationarity levels from standard to open-ended environments. This framework enables researchers to systematically quantify plasticity loss, evaluate mitigation strategies, and analyze plasticity dynamics across different contexts. Our documentation, examples, and source code are available at https://github.com/RLE-Foundation/Plasticine.

Paper Structure

This paper contains 57 sections, 21 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Overview of the Plasticine framework, which consists of four core parts. The Methods part provides high-quality, single-file implementations of diverse plasticity loss mitigation methods. The Metrics part provides comprehensive evaluation metrics that reflect how plasticity varies during training. The Environments part provides environment configurations for different levels of continual RL scenarios, where $*$ denotes a variant designed for the continual scenario. Finally, the Benchmark part, built on Weights & Biases, provides reusable datasets and models to facilitate related research.
  • Figure 2: Screenshots of the well-established ALE, Procgen, DMC, and Craftax benchmarks (left to right).
  • Figure 3: Maintaining plasticity in continual Procgen.
  • Figure 4: Maintaining plasticity in continual DMC.