Task Scheduling & Forgetting in Multi-Task Reinforcement Learning

Marc Speckmann, Theresa Eimer

TL;DR

This work investigates forgetting in multi-task reinforcement learning and whether human-inspired forgetting prevention strategies transfer to RL. It analyzes forgetting curves in RL by switching between tasks and tests Leitner and SuperMemo against Prioritized Level Replay (PLR) for curriculum generation on MiniGrid with PPO. The results show RL can exhibit forgetting and relearning patterns similar to humans, but spacing-based schedules do not universally outperform baselines, due to asymmetric task retention and complex task interactions. The findings highlight the need for curricula that capture inter-task relationships to improve efficiency in multi-task RL.

Abstract

Reinforcement learning (RL) agents can forget tasks they have previously been trained on. There is a rich body of work on such forgetting effects in humans. We therefore look for commonalities in the forgetting behavior of humans and RL agents across tasks, and test whether forgetting prevention measures from learning theory are viable in RL. We find that in many cases, RL agents exhibit forgetting curves similar to those of humans. Methods like Leitner or SuperMemo have been shown to be effective at counteracting human forgetting, but we demonstrate that they do not transfer as well to RL. We identify a likely cause: asymmetrical learning and retention patterns between tasks that cannot be captured by retention-based or performance-based curriculum strategies.
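To make the idea of a Leitner-style curriculum concrete, the following is a minimal sketch of how a box-based spacing schedule could pick tasks for training. It is not the authors' implementation: the class name `LeitnerScheduler`, the success `threshold`, the number of boxes, and the rule that a task in box k is revisited every 2^k rounds are all assumptions made for illustration. The task names are taken from the MiniGrid tasks mentioned in the figure captions.

```python
from dataclasses import dataclass, field


@dataclass
class LeitnerScheduler:
    """Toy Leitner-style task scheduler (illustrative, not the paper's method)."""

    tasks: list[str]
    num_boxes: int = 3        # assumed number of Leitner boxes
    threshold: float = 0.8    # hypothetical success threshold on evaluation reward
    box: dict[str, int] = field(init=False)

    def __post_init__(self) -> None:
        # Every task starts in box 0, the most frequently revisited box.
        self.box = {task: 0 for task in self.tasks}

    def due(self, round_idx: int) -> list[str]:
        # Assumed spacing rule: a task in box k is due every 2**k rounds.
        return [t for t in self.tasks if round_idx % (2 ** self.box[t]) == 0]

    def update(self, task: str, reward: float) -> None:
        if reward >= self.threshold:
            # Task was retained: promote it so it is revisited less often.
            self.box[task] = min(self.box[task] + 1, self.num_boxes - 1)
        else:
            # Task was forgotten: demote it to the most frequent box.
            self.box[task] = 0


scheduler = LeitnerScheduler(tasks=["Empty", "DoorKey", "Unlock"])
scheduler.update("Empty", reward=0.95)   # Empty is promoted to box 1
print(scheduler.due(round_idx=1))        # tasks to train on in round 1
```

A schedule like this only looks at each task's own retention. It has no way to represent that training on one task may help or hurt another, which is the kind of inter-task asymmetry the abstract identifies as a likely reason such methods transfer poorly to RL.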

Paper Structure

This paper contains 4 sections, 6 figures.

Figures (6)

  • Figure 1: Mean evaluation reward of the SimpleCrossing task showing decreasing forgetting
  • Figure 2: Mean evaluation reward of the SimpleCrossing task showing periodic forgetting
  • Figure 3: Normalized mean evaluation rewards of all curricula; #runs=10
  • Figure 4: Mean evaluation reward of all curricula per task; #runs=10
  • Figure 5: Crosstraining Unlock (left) and DoorKey (right) with Empty while evaluating both.
  • ...and 1 more figures