Automatic Curriculum Learning For Deep RL: A Short Survey

Rémy Portelas, Cédric Colas, Lilian Weng, Katja Hofmann, Pierre-Yves Oudeyer

TL;DR

This short survey formalizes Automatic Curriculum Learning (ACL) for Deep Reinforcement Learning and presents a three-dimensional typology (why use ACL, what ACL controls, what ACL optimizes). It distinguishes data-collection and data-exploitation roles of ACL, detailing controls over initial states, rewards, goals, environments, opponents, transitions, and substituted goals, with concrete examples such as PER, HER, GoalGAN, CURIOUS, and PCG-based curricula. The paper surveys surrogate objectives—reward, intermediate difficulty, learning progress, diversity, surprise, energy, and adversarial reward maximization—showcasing ACL’s capacity to improve sample efficiency, solve hard tasks, and foster generalist and open-ended exploration across single- and multi-goal DRL problems. It also discusses the need for systematic benchmarks, theoretical foundations, and cross-disciplinary integration to advance ACL toward robust, open-ended learning agents.

Abstract

Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization, and to solve sparse reward problems, among others. The ambition of this work is twofold: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas.

Paper Structure

This paper contains 35 sections, 1 equation, 1 figure, 1 table.

Figures (1)

  • Figure 1: ACL for data collection. ACL can control each element of the task MDPs to shape the learning trajectories of agents. Given metrics of the agent's behavior, such as performance or visited states, ACL methods generate new tasks adapted to the agent's abilities.
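
The loop described in Figure 1 can be illustrated with a minimal Python sketch. It is not taken from the paper. It shows a task sampler driven by absolute learning progress, one of the surrogate objectives the survey lists. The class name `LearningProgressSampler`, its parameters (`window`, `epsilon`), and the fixed discrete task set are assumptions made for illustration; published ACL methods differ in how they estimate progress and in how they generate tasks.

```python
import random
from collections import deque


class LearningProgressSampler:
    """Hypothetical sampler: favors tasks whose recent returns change the most."""

    def __init__(self, n_tasks, window=10, epsilon=0.2, seed=0):
        self.n_tasks = n_tasks
        self.epsilon = epsilon  # probability of sampling a task uniformly
        self.rng = random.Random(seed)
        # Keep the last 2 * window episode returns observed on each task.
        self.history = [deque(maxlen=2 * window) for _ in range(n_tasks)]

    def update(self, task, episode_return):
        """Record the agent's return on a task (the behavior metric)."""
        self.history[task].append(episode_return)

    def learning_progress(self, task):
        """Absolute difference between recent and older mean returns."""
        h = list(self.history[task])
        if len(h) < 2:
            return 0.0
        half = len(h) // 2
        older, recent = h[:half], h[half:]
        return abs(sum(recent) / len(recent) - sum(older) / len(older))

    def sample(self):
        """Pick the next task, proportionally to learning progress."""
        lp = [self.learning_progress(t) for t in range(self.n_tasks)]
        if sum(lp) == 0.0 or self.rng.random() < self.epsilon:
            # No progress signal yet, or exploration step: sample uniformly.
            return self.rng.randrange(self.n_tasks)
        return self.rng.choices(range(self.n_tasks), weights=lp, k=1)[0]
```

In a training loop, one would call `sample()` to choose the next task, run an episode on it, and pass the resulting return to `update()`. Tasks that are too hard (returns stay flat and low) or already mastered (returns stay flat and high) both have near-zero progress, so the sampler concentrates on tasks of intermediate difficulty, while the `epsilon` term keeps occasionally revisiting the others.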