Projected Task-Specific Layers for Multi-Task Reinforcement Learning

Josselin Somerville Roberts, Julia Di

TL;DR

This work tackles the challenge of generalizing across related robotic manipulation tasks in multi-task reinforcement learning by addressing task interference with a novel architecture, Projected Task-Specific Layers (PTSL). PTSL combines a large shared backbone with low-rank, task-specific corrections and uses projections to blend shared and task-specific representations, optionally atop the CARE encoder. Empirically, PTSL achieves state-of-the-art performance on Meta-World MT10 and MT50, delivering faster convergence and improved sample efficiency, including notable gains when integrated with CARE. Ablation studies show the benefits of a shared projection and the influence of residual configurations, underscoring the value of structured parameter sharing for scalable multi-task robotics.
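The combination of a shared backbone, low-rank task-specific corrections, and a projection can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the class name `ProjectedTaskSpecificLayer`, the parameter names, the ReLU activation, the initialization, and the additive way the correction is merged are all assumptions made for the example.

```python
import numpy as np


class ProjectedTaskSpecificLayer:
    """Hypothetical layer: shared transform plus a projected per-task correction."""

    def __init__(self, num_tasks, hidden_dim, rank, projection=None, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_dim)
        # Large shared weights, used by every task.
        self.w_shared = rng.normal(0.0, scale, (hidden_dim, hidden_dim))
        self.b_shared = np.zeros(hidden_dim)
        # One small matrix per task mapping hidden_dim -> rank (assumed shape).
        self.w_task = rng.normal(0.0, scale, (num_tasks, hidden_dim, rank))
        # Projection rank -> hidden_dim; pass the same array to several
        # layers to share it across them (assumption for illustration).
        if projection is None:
            projection = rng.normal(0.0, 1.0 / np.sqrt(rank), (rank, hidden_dim))
        self.projection = projection

    def __call__(self, h, task_id):
        # Shared path: the same computation for all tasks.
        shared = np.maximum(h @ self.w_shared + self.b_shared, 0.0)
        # Task path: low-dimensional features projected back to hidden_dim.
        correction = (h @ self.w_task[task_id]) @ self.projection
        return shared + correction


# Three stacked layers that reuse one projection matrix.
shared_projection = np.random.default_rng(1).normal(0.0, 0.5, (4, 32))
layers = [
    ProjectedTaskSpecificLayer(num_tasks=10, hidden_dim=32, rank=4,
                               projection=shared_projection, seed=i)
    for i in range(3)
]
h = np.ones((5, 32))  # batch of 5 hidden states
for layer in layers:
    h = layer(h, task_id=3)
```

In this sketch the per-task parameters grow with `hidden_dim * rank` per layer rather than `hidden_dim ** 2`, which is the intuition behind describing the corrections as low-rank.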

Abstract

Multi-task reinforcement learning could enable robots to scale across a wide variety of manipulation tasks in homes and workplaces. However, generalizing from one task to another and mitigating negative task interference remain challenges. Addressing them by successfully sharing information across tasks will depend on how well the structure underlying the tasks is captured. In this work, we introduce a new architecture, Projected Task-Specific Layers (PTSL), that leverages a common policy with dense task-specific corrections through task-specific layers to better express shared and variable task information. We then show that our model outperforms the state of the art on the MT10 and MT50 benchmarks of Meta-World, which consist of 10 and 50 goal-conditioned tasks for a Sawyer arm.

Paper Structure (18 sections, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Simplified diagram of different architectures for multi-task reinforcement learning: shared backbone for all tasks (left), individual backbone for each task (center), and Projected Task-Specific Layers (ours, right).
  • Figure 2: PTSL architecture (ours); see Section \ref{sec:ptsl} for details. The dotted red lines represent residual connections (not always present); see Section \ref{sec:residuals}. Projection modules that are reused are shown in the same color; see Section \ref{sec:projection}.
  • Figure 3: The MT10 benchmark from Meta-World contains 10 tasks: reach, push, pick and place, open door, open drawer, close drawer, press button top-down, insert peg side, open window, and open box.
  • Figure 4: Training curves of different methods on all benchmarks. For MT10, PTSL converges faster than baselines, and for MT50, we see a gain in sample efficiency. The bolded line represents the mean over $n=10$ runs for the short horizon and $n=4$ for the long horizon. The shaded area represents the standard error.
  • Figure 5: Training curves of different methods on all benchmarks. The bolded line represents the mean over $n=10$ runs for the short horizon and $n=4$ for the long horizon. The shaded area represents the standard error.
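
The mean line and standard-error band described in the captions of Figures 4 and 5 can be computed per evaluation step as in the sketch below. The function name `mean_and_standard_error` and the success rates are hypothetical and only illustrate the arithmetic; they are not results from the paper.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for one evaluation step."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical success rates from n = 4 runs at a single training step.
runs = [0.5, 0.7, 0.6, 0.6]
mean, sem = mean_and_standard_error(runs)
lower, upper = mean - sem, mean + sem  # edges of the shaded band
```

With fewer runs (such as n = 4 for the long horizon), the same spread between runs yields a wider band than with n = 10, since the standard error shrinks with the square root of n.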