Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone

TL;DR

This survey formalizes curriculum learning (CL) for reinforcement learning, framing curricula as directed acyclic graphs (DAGs) over samples and tasks to accelerate training. It decomposes CL into task generation, sequencing, and transfer learning, and develops a seven-dimensional taxonomy to classify methods. By surveying sample-reordering, co-learning, and environment-modification approaches, it highlights open problems including fully automated task creation, sim-to-real transfer, and theoretical guarantees. The work provides a consolidated foundation and guidance for researchers pursuing efficient, scalable curricula in RL and related domains.

Abstract

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task. More recently, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum for the purpose of learning a problem that may otherwise be too difficult to learn from scratch. In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals. Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research.

Paper Structure

This paper contains 31 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Different subgames in the game of Quick Chess, which are used to form a curriculum for learning the full game of Chess.
  • Figure 2: Performance metrics for transfer learning using (a) weak transfer and (b) strong transfer with offset curves.
  • Figure 3: Examples of structures of curricula from previous work. (a) Linear sequences in a gridworld domain (Narvekar et al., 2017). (b) Directed acyclic graphs in Block Dude (Svetlik et al., 2017).
  • Figure 4: One example of curricula designed by human users. (a) Given final task. (b) A curriculum designed by one human participant.

Theorems & Definitions (5)

  • Definition 1
  • Definition 2: Curriculum
  • Definition 3: Single-task Curriculum
  • Definition 4: Task-level Curriculum
  • Definition 5: Sequence Curriculum
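
The TL;DR describes curricula as DAGs over tasks, and Figure 3 contrasts linear sequences with general graphs. As a minimal sketch of that idea (not the paper's formal definitions), the snippet below stores a task-level curriculum as a prerequisite map and derives one valid training order from it. The task names are hypothetical, loosely inspired by the Quick Chess subgames in Figure 1, and the function name `training_order` is an assumption made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical curriculum: each task maps to the set of tasks
# that should be learned before it (the incoming DAG edges).
prerequisites = {
    "pawns_only": set(),
    "pawns_and_king": {"pawns_only"},
    "rooks_and_king": {"pawns_only"},
    "full_game": {"pawns_and_king", "rooks_and_king"},
}


def training_order(prereqs):
    """Return one linear task order that respects every DAG edge.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(prereqs).static_order())


order = training_order(prerequisites)
print(order)
```

A linear sequence curriculum is the special case where every task has at most one prerequisite and at most one successor, so only one order is possible. In the general DAG case several orders can be valid, and the sketch simply returns one of them.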