
Shared-unique Features and Task-aware Prioritized Sampling on Multi-task Reinforcement Learning

Po-Shao Lin, Jia-Fong Yeh, Yi-Ting Chen, Winston H. Hsu

TL;DR

STARS targets the persistent issue of performance imbalance in multi-task reinforcement learning by integrating a shared-unique feature extractor with a task-aware prioritized sampling strategy. The shared-unique extractor couples knowledge sharing (via a shared parameter pool) with task-specific detail (via triplet-guided embeddings), while TaPS dynamically allocates training samples across tasks based on both transition TD-errors and per-task priorities. Empirical results on Meta-World MT-10 show STARS achieving a statistically significant improvement over prior SOTA methods (about 6.5% on average) and reduced performance variability across tasks, with ablations and feature visualizations supporting the design choices. Although MT-50 results are more modest, the approach demonstrates clear advantages in knowledge synergy and training resource allocation, highlighting practical gains for scalable, robust multi-task RL.
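The "triplet-guided embeddings" mentioned above can be illustrated with the standard triplet margin loss. This is a minimal sketch, not the authors' implementation: the Euclidean distance, the margin value, and the convention that the anchor and positive come from the same task while the negative comes from a different task are all assumptions made here for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Assumption (not from the paper): anchor and positive are unique
    features of the same task; negative is a feature of another task.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes features of different tasks apart while keeping same-task features close.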

Abstract

We observe that current state-of-the-art (SOTA) methods for multi-task reinforcement learning (MTRL) suffer from performance imbalance: while they may achieve impressive average performance, they perform extremely poorly on a few tasks. To address this, we propose STARS, a new method that consists of two novel strategies: a shared-unique feature extractor and task-aware prioritized sampling. First, the shared-unique feature extractor learns both shared and task-specific features to enable better synergy of knowledge between different tasks. Second, the task-aware sampling strategy is combined with prioritized experience replay so that learning focuses efficiently on tasks with poor performance. The effectiveness and stability of STARS are verified through experiments on the mainstream Meta-World benchmark. The results show that STARS statistically outperforms current SOTA methods and alleviates the performance imbalance issue. In addition, we visualize the learned features to support our claims and enhance the interpretability of STARS.
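One way to picture how task-aware sampling could be combined with prioritized experience replay is a two-level draw: first split the batch across tasks by a per-task priority, then sample transitions within each task by TD-error. The sketch below is written under stated assumptions and is not the paper's algorithm. The task priority `1 - success_rate + floor`, the proportional form `(|td| + eps) ** alpha` borrowed from standard prioritized replay, and all function and parameter names are hypothetical.

```python
import random


def allocate_batch(task_priorities, batch_size):
    """Split batch_size across tasks in proportion to their priorities.

    Uses largest-remainder rounding so the counts always sum to batch_size.
    """
    total = sum(task_priorities.values())
    raw = {t: batch_size * p / total for t, p in task_priorities.items()}
    counts = {t: int(r) for t, r in raw.items()}
    leftover = batch_size - sum(counts.values())
    # Give the remaining slots to the tasks with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda t: raw[t] - counts[t], reverse=True)
    for t in by_fraction[:leftover]:
        counts[t] += 1
    return counts


def sample_transitions(td_errors, k, alpha=0.6, eps=1e-3, rng=random):
    """Draw k indices with probability proportional to (|td| + eps) ** alpha."""
    weights = [(abs(e) + eps) ** alpha for e in td_errors]
    return rng.choices(range(len(td_errors)), weights=weights, k=k)


def task_aware_prioritized_batch(buffers, success_rates, batch_size,
                                 floor=0.05, rng=random):
    """Return {task: [transition indices]} for one training batch.

    buffers maps each task to a list of TD-errors, one per stored transition.
    Assumption: a task's priority grows as its success rate falls; the floor
    keeps already-solved tasks from being starved of samples.
    """
    task_priorities = {t: 1.0 - sr + floor for t, sr in success_rates.items()}
    counts = allocate_batch(task_priorities, batch_size)
    return {t: sample_transitions(buffers[t], counts[t], rng=rng) for t in buffers}
```

With two tasks at success rates 0.9 and 0.1, most of the batch goes to the weaker task, and within each task the transitions with large TD-errors are drawn most often.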

Paper Structure (31 sections, 9 equations, 5 figures, 7 tables, 1 algorithm)


Figures (5)

  • Figure 1: Performance imbalance issue. (a) Selected tasks from the Meta-World benchmark (Yu et al., 2020). Tasks like door-open and drawer-open share some related skills, while peg-insert-side may contain more unique skills. (b) Performance comparison between our method and previous MTRL methods. The results are averaged across tasks and the colored area is the standard deviation (STD). A larger STD value indicates that the method suffers more severely from the performance imbalance issue. Our method achieves the best average performance and consistently maintains a lower standard deviation.
  • Figure 2: Architecture of STARS. STARS consists of two main components: a shared-unique feature encoder and a task-aware sampling strategy. With these two components, STARS enhances knowledge synergy between tasks and focuses on tasks with poor learning outcomes (see the overview paragraph in Section \ref{sec:method} for more details).
  • Figure 3: t-SNE visualization of the learned unique features of our STARS on the MT-10 track.
  • Figure 4: Performance imbalance issue (per task). The main paper presents the issue using the average success rate (SR) across tasks in Figure \ref{fig:issue} and Table \ref{tab:result}. Here, we provide the average SR of each task (already averaged over 10 runs). STARS effectively focuses on the more difficult tasks and demonstrates the best performance stability among the compared methods.
  • Figure 5: t-SNE visualization of the learned unique features of our STARS on the MT-50 track.
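
The captions for Figures 1 and 4 use the standard deviation of success rates across tasks as the signal of performance imbalance. A minimal sketch of that summary follows. The use of the population standard deviation and the function name `imbalance_summary` are assumptions, and any numbers passed to it in a quick check would be illustrative values, not results from the paper.

```python
from statistics import mean, pstdev


def imbalance_summary(per_task_success):
    """Summarize per-task success rates (each in [0, 1]).

    A high mean with a high std means a method does well on average
    while failing on a few tasks, i.e. the imbalance issue.
    Assumption: population standard deviation across tasks.
    """
    rates = list(per_task_success.values())
    return {
        "mean": mean(rates),
        "std": pstdev(rates),
        "worst_task": min(per_task_success, key=per_task_success.get),
    }
```

Two methods with the same mean can differ sharply on `std` and `worst_task`, which is why the average success rate alone can hide the problem.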