Policy Compatible Skill Incremental Learning via Lazy Learning Interface

Daehee Lee, Dongsu Lee, TaeYoon Kwack, Wonje Choi, Honguk Woo

TL;DR

SIL-C tackles the core problem of maintaining compatibility between evolving skill libraries and downstream hierarchical policies in Skill Incremental Learning. It introduces a bilateral lazy learning interface that aligns the subtask space of high-level policies with the skill space of low-level decoders by matching trajectory distributions, enabling forward and backward compatibility without re-training. The approach uses append-only prototype memories and a two-stage instance-based matching process (validation followed by hooking) to map subtasks to executable skills at inference time, improving sample efficiency and modularity across diverse SIL scenarios. Empirical results in Franka Kitchen and Meta-World demonstrate superior skill-policy compatibility (higher AUC) and robust performance under noise and limited supervision, highlighting SIL-C’s potential for scalable, lifelong robotic learning. Overall, SIL-C lets newly learned skills improve existing policies without re-training those policies.

Abstract

Skill Incremental Learning (SIL) is the process by which an embodied agent expands and refines its skill set over time by leveraging experience gained through interaction with its environment or by the integration of additional data. SIL facilitates efficient acquisition of hierarchical policies grounded in reusable skills for downstream tasks. However, as the skill repertoire evolves, it can disrupt compatibility with existing skill-based policies, limiting their reusability and generalization. In this work, we propose SIL-C, a novel framework that ensures skill-policy compatibility, allowing improvements in incrementally learned skills to enhance the performance of downstream policies without requiring policy re-training or structural adaptation. SIL-C employs a bilateral lazy learning-based mapping technique to dynamically align the subtask space referenced by policies with the skill space decoded into agent behaviors. This enables each subtask, derived from the policy's decomposition of a complex task, to be executed by selecting an appropriate skill based on trajectory distribution similarity. We evaluate SIL-C across diverse SIL scenarios and demonstrate that it maintains compatibility between evolving skills and downstream policies while ensuring efficiency throughout the learning process.
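The mapping described above can be illustrated with a small sketch. It is not the authors' implementation: the `PrototypeMemory` class, the `resolve_skill` function, the use of a single point as a stand-in for a trajectory distribution, the Euclidean distance, and the threshold-based validation rule are all assumptions made for illustration, as are the skill names. The sketch only shows the general idea of an append-only, instance-based (lazy) lookup in which a subtask is first validated against the current state and then hooked to the nearest skill prototype.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PrototypeMemory:
    """Append-only store (hypothetical): entries are added, never edited or removed."""
    keys: list = field(default_factory=list)
    prototypes: list = field(default_factory=list)

    def append(self, key, prototype):
        self.keys.append(key)
        self.prototypes.append(tuple(prototype))

    def get(self, key):
        return self.prototypes[self.keys.index(key)]

    def nearest(self, query):
        # Lazy learning: nothing is fitted in advance; stored instances
        # are compared with the query only when a decision is needed.
        dists = [math.dist(p, query) for p in self.prototypes]
        best = min(range(len(dists)), key=dists.__getitem__)
        return self.keys[best], dists[best]


def resolve_skill(subtask, state, subtask_memory, skill_memory, threshold):
    """Map a policy's subtask to a skill id (illustrative two-stage lookup)."""
    target = subtask_memory.get(subtask)
    # Stage 1, validation (assumed criterion): the requested subtask must be
    # close enough to the current state; otherwise fall back to the subtask
    # prototype nearest to the state.
    if math.dist(target, state) > threshold:
        nearest_subtask, _ = subtask_memory.nearest(state)
        target = subtask_memory.get(nearest_subtask)
    # Stage 2, hooking (assumed criterion): bind to the closest skill prototype.
    skill, _ = skill_memory.nearest(target)
    return skill


subtasks = PrototypeMemory()
subtasks.append("open_microwave", (0.0, 0.0))
subtasks.append("move_kettle", (5.0, 5.0))

skills = PrototypeMemory()
skills.append("skill_v1_a", (0.5, 0.0))
skills.append("skill_v1_b", (4.0, 4.0))

state = (5.0, 4.5)
before = resolve_skill("move_kettle", state, subtasks, skills, threshold=2.0)

# A later skill-update phase appends a better-matching skill; the policy and
# its subtask memory are untouched, yet the lookup now picks the new skill.
skills.append("skill_v2_c", (5.0, 5.1))
after = resolve_skill("move_kettle", state, subtasks, skills, threshold=2.0)

print(before, after)
```

Because the memories only grow and the match is computed at query time, adding a skill changes which skill a subtask resolves to without modifying the policy, which is the compatibility property the abstract describes.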

Paper Structure

This paper contains 41 sections, 13 equations, 10 figures, 17 tables, 2 algorithms.

Figures (10)

  • Figure 1: Overview of policy-compatible skill incremental learning and SIL-C framework
  • Figure 2: Overview of the SIL-C framework: components, updates, and integration
  • Figure 3: (Top) SIL scenario types and (Bottom) evaluation groups
  • Figure 4: Results on Kitchen and Meta-World tasks under Emergent and Explicit skill incremental scenarios. Each group of bars represents a skill update phase ($x$-axis), and the $y$-axis shows the scaled reward. Darker bars represent evaluation with the initial policies. Lighter bars show performance after re-training with updated skills. Diamonds (◇) indicate the application of SIL-C to each baseline. The skill decoder uses the AA strategy for Kitchen and ER for Meta-World.
  • Figure 5: Ablation of the SIL-C lazy learning interface in the Kitchen Emergent SIL scenario using PTGM with AA configuration. (Left) Per-phase performance across three settings. (Right) Example evaluation trajectories illustrating the selected subtasks and the corresponding skills used to solve the task: open microwave $\rightarrow$ move kettle $\rightarrow$ turn on top burner $\rightarrow$ open hinge cabinet.
  • ...and 5 more figures