SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

Kaushik Roy, Giovanni D'urso, Nicholas Lawrance, Brendan Tidd, Peyman Moghadam

TL;DR

SPREAD is a geometry-preserving framework that uses singular value decomposition (SVD) to align policy representations across tasks within low-rank subspaces. On the LIBERO benchmark, it substantially improves knowledge transfer, mitigates catastrophic forgetting, and achieves state-of-the-art performance.

Abstract

A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures that underlie task representations across sequential learning. Existing distillation methods, which rely on L2-norm feature matching in raw feature space, are sensitive to noise and high-dimensional variability, often failing to preserve intrinsic task manifolds. To address this, we introduce SPREAD, a geometry-preserving framework that employs singular value decomposition (SVD) to align policy representations across tasks within low-rank subspaces. This alignment maintains the underlying geometry of multimodal features, facilitating stable transfer, robustness, and generalization. Additionally, we propose a confidence-guided distillation strategy that applies a Kullback-Leibler divergence loss restricted to the top-M most confident action samples, emphasizing reliable modes and improving optimization stability. Experiments on LIBERO, a lifelong imitation learning benchmark, show that SPREAD substantially improves knowledge transfer, mitigates catastrophic forgetting, and achieves state-of-the-art performance.
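To make the contrast between raw L2 feature matching and subspace alignment concrete, the following is a minimal NumPy sketch, not the paper's actual loss. It assumes that the previous policy's features for a batch are stacked in a matrix, that the subspace is spanned by its top right singular vectors, and that the loss is the mean squared difference of the projected features. The names `top_r_basis`, `subspace_distillation_loss`, and the `rank` parameter are hypothetical.

```python
import numpy as np


def top_r_basis(features: np.ndarray, rank: int) -> np.ndarray:
    """Return a (D, rank) orthonormal basis of the dominant feature subspace.

    `features` has shape (batch, D). The basis consists of the right
    singular vectors associated with the `rank` largest singular values.
    """
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:rank].T


def subspace_distillation_loss(old_feats: np.ndarray,
                               new_feats: np.ndarray,
                               rank: int) -> float:
    """Mean squared discrepancy measured inside the old policy's subspace.

    Assumption (illustrative only): differences orthogonal to the old
    policy's low-rank subspace are treated as noise and not penalized.
    """
    basis = top_r_basis(old_feats, rank)
    projected_diff = (new_feats - old_feats) @ basis
    return float(np.mean(np.sum(projected_diff ** 2, axis=1)))


def raw_l2_loss(old_feats: np.ndarray, new_feats: np.ndarray) -> float:
    """Baseline: mean squared discrepancy in the raw feature space."""
    return float(np.mean(np.sum((new_feats - old_feats) ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic rank-2 features living in an 8-dimensional space.
    old = rng.normal(size=(16, 2)) @ rng.normal(size=(2, 8))
    # A direction orthogonal to the rank-2 row space of `old`.
    off_subspace = np.linalg.svd(old, full_matrices=True)[2][-1]
    new = old + 0.5 * off_subspace
    print("raw L2 loss:  ", raw_l2_loss(old, new))
    print("subspace loss:", subspace_distillation_loss(old, new, rank=2))
```

In this toy setting the perturbation lies entirely outside the old subspace, so the raw L2 loss penalizes it while the projected loss is numerically zero. That is the sense in which a subspace loss can be less sensitive to high-dimensional variability than raw feature matching.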

Paper Structure (13 sections, 7 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Geometric interpretation of the Subspace Representation Distillation loss. SPREAD maximizes similarity between consecutive LIL policies by minimizing discrepancies in projected feature representations across different input modalities within each policy’s subspace.
  • Figure 2: Overview of the proposed SPREAD method. Subspace Representation Distillation aligns the latent representations from different input modality encoders (e.g., Task, AgentView, HandEye, Joint, and Gripper information), while confidence-guided policy distillation maps the action distribution of the GMM policy between incremental steps $T^{k-1}$ and $T^{k}$.
  • Figure 3: Average success rate across incremental tasks on (a) LIBERO-OBJECT and (b) LIBERO-GOAL (higher is better).
  • Figure 4: Forward Transfer (FWT) across incremental tasks on (a) LIBERO-OBJECT and (b) LIBERO-GOAL (higher is better).
  • Figure 5: Negative Backward Transfer (NBT) across incremental tasks on LIBERO-GOAL (lower is better).
  • ...and 2 more figures
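
The confidence-guided policy distillation named in the Figure 2 caption restricts a Kullback-Leibler divergence loss to the top-M most confident action samples. The sketch below is one possible reading of that idea, not the paper's implementation. It assumes a set of candidate actions has already been scored with log-densities under the previous and current policies, that "confidence" means the previous policy's log-density, and that both score vectors are renormalized over the selected samples. The function name `top_m_kl` and its parameters are hypothetical.

```python
import numpy as np


def _log_softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable log-softmax over a 1-D array."""
    shifted = x - np.max(x)
    return shifted - np.log(np.sum(np.exp(shifted)))


def top_m_kl(teacher_logp, student_logp, m: int) -> float:
    """KL(teacher || student) over the teacher's M most confident samples.

    teacher_logp, student_logp: log-densities of the same N candidate
    actions under the previous (teacher) and current (student) policy.
    Assumption (illustrative only): both are renormalized with a softmax
    over the M selected samples before the divergence is computed.
    """
    teacher_logp = np.asarray(teacher_logp, dtype=float)
    student_logp = np.asarray(student_logp, dtype=float)
    if teacher_logp.shape != student_logp.shape or teacher_logp.ndim != 1:
        raise ValueError("expected two 1-D arrays of equal length")
    if not 1 <= m <= teacher_logp.size:
        raise ValueError("m must be between 1 and the number of samples")

    # Indices of the M samples the teacher scores highest.
    keep = np.argsort(teacher_logp)[-m:]
    t = _log_softmax(teacher_logp[keep])
    s = _log_softmax(student_logp[keep])
    return float(np.sum(np.exp(t) * (t - s)))


if __name__ == "__main__":
    teacher = [0.0, -1.0, -10.0]   # third sample is a low-confidence mode
    student = [0.0, -1.0, -3.0]    # disagrees only on that sample
    print("top-2 KL:", top_m_kl(teacher, student, m=2))
    print("top-3 KL:", top_m_kl(teacher, student, m=3))
```

With M = 2 the disagreement on the low-confidence sample is excluded from the loss, while with M = 3 it contributes a positive divergence. This illustrates how restricting the loss to confident samples focuses the distillation signal on reliable modes.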