Self-Improving Skill Learning for Robust Skill-based Meta-Reinforcement Learning

Sanghyeon Lee; Sangjun Bae; Yisak Park; Seungyul Han

TL;DR

SISL tackles the instability of skill-based meta-RL under noisy offline data by decoupling the high-level policy from a dedicated skill-improvement policy and by prioritizing updates on task-relevant trajectories through maximum return relabeling. A reward-model-guided relabeling scheme concentrates learning on promising offline samples, while an online improvement process progressively denoises the skill library. Empirical results on Kitchen, Office, Maze2D, and AntMaze show that SISL outperforms existing skill-based meta-RL approaches, especially as demonstration noise increases, at an acceptable computational overhead. The work offers a practical, data-efficient path to robust long-horizon meta-learning in real-world settings where data quality cannot be guaranteed.

Abstract

Meta-reinforcement learning (Meta-RL) facilitates rapid adaptation to unseen tasks but faces challenges in long-horizon environments. Skill-based approaches tackle this by decomposing state-action sequences into reusable skills and employing hierarchical decision-making. However, these methods are highly susceptible to noisy offline demonstrations, leading to unstable skill learning and degraded performance. To address this, we propose Self-Improving Skill Learning (SISL), which performs self-guided skill refinement using decoupled high-level and skill-improvement policies and applies skill prioritization via maximum return relabeling to focus updates on task-relevant trajectories. Together, these components yield robust and stable adaptation even under noisy and suboptimal data. By mitigating the effect of noise, SISL achieves reliable skill learning and consistently outperforms other skill-based meta-RL methods on diverse long-horizon tasks.
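As a rough illustration of the prioritization idea, the sketch below scores each offline trajectory by its largest predicted return across a set of task reward models and converts those scores into sampling weights. This is only one plausible reading of "maximum return relabeling", not the paper's implementation: the toy scalar states, the function names `max_return_relabel` and `prioritization_weights`, the softmax weighting, and the `temperature` parameter are all assumptions introduced here.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Toy types (assumptions for illustration): scalar state and action.
Transition = Tuple[float, float]
RewardModel = Callable[[float, float], float]


def max_return_relabel(
    trajectory: Sequence[Transition],
    reward_models: Sequence[RewardModel],
) -> float:
    """Return the largest predicted return of the trajectory over all task reward models."""
    return max(
        sum(model(state, action) for state, action in trajectory)
        for model in reward_models
    )


def prioritization_weights(
    trajectories: Sequence[Sequence[Transition]],
    reward_models: Sequence[RewardModel],
    temperature: float = 1.0,
) -> List[float]:
    """Softmax over relabeled returns (hypothetical weighting choice)."""
    returns = [max_return_relabel(t, reward_models) for t in trajectories]
    peak = max(returns)  # subtract the maximum for numerical stability
    exps = [math.exp((r - peak) / temperature) for r in returns]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical tasks: one rewards positive states, the other negative states.
reward_models = [lambda s, a: s, lambda s, a: -s]
useful_traj = [(1.0, 0.0), (1.0, 0.0)]   # clearly relevant to the first task
noisy_traj = [(0.1, 0.0), (-0.1, 0.0)]   # relevant to neither task
weights = prioritization_weights([useful_traj, noisy_traj], reward_models)
```

Under these assumptions, the task-relevant trajectory receives the larger sampling weight, so updates concentrate on it rather than on the noisy one.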
