Efficient Continual Adaptation of Pretrained Robotic Policy with Online Meta-Learned Adapters

Ruiqi Zhu, Endong Sun, Guanhe Huang, Oya Celiktutan

TL;DR

The paper tackles continual adaptation of pretrained robotic policies by enabling cross-task knowledge transfer through Online Meta-Learned Adapters (OMLA). It combines parameter-efficient adapters (LoRA) with an online meta-learning objective that learns adapter priors from previously seen tasks, using a memory-efficient, anchor-based data sampling strategy. The process has two stages: it first learns these priors online, then fine-tunes adapters for each new task, which improves forward transfer with no backward transfer on diverse tasks and on a real robot. Empirical results on the LIBERO benchmarks and in real Kinova experiments show improved adaptation performance, and the learned representations become more structured after online meta-learning, which is consistent with cross-task transfer.
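To make the adapter idea in the summary above concrete, the following is a minimal sketch of a LoRA-style linear layer, not the authors' implementation. It assumes the standard LoRA formulation (a frozen weight `W` plus a scaled low-rank update `B A`). The class name `LoRALinear`, the `prior` argument, and the helper functions are hypothetical names introduced only for illustration. The `prior` argument stands in for the second stage described above, where an adapter for a new task would start from learned values rather than from a fresh initialization. How OMLA actually learns that prior is not shown here.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


class LoRALinear:
    """Frozen linear layer plus a low-rank adapter (hypothetical sketch).

    Output: x W^T + (alpha / rank) * x A^T B^T, where only A and B would be trained.
    """

    def __init__(self, weight, rank=2, alpha=4.0, prior=None, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen
        self.scale = alpha / rank
        if prior is None:
            # Standard LoRA init: small random A, zero B, so the adapter
            # contributes nothing before any training.
            self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
            self.B = [[0.0] * rank for _ in range(out_dim)]
        else:
            # Assumption: a prior is a pair (A, B) used as the starting point
            # for fine-tuning on a new task.
            self.A = [row[:] for row in prior[0]]
            self.B = [row[:] for row in prior[1]]

    def forward(self, x):
        base = matmul(x, transpose(self.weight))
        low_rank = matmul(matmul(x, transpose(self.A)), transpose(self.B))
        return [[b + self.scale * l for b, l in zip(b_row, l_row)]
                for b_row, l_row in zip(base, low_rank)]

    def num_adapter_params(self):
        """Count the adapter entries (A and B); the frozen weight is excluded."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    W = [[1, 0, 0], [0, 1, 0]]          # toy 2x3 "pretrained" weight
    x = [[1, 2, 3]]                     # one input row
    fresh = LoRALinear(W)               # adapter starts as a no-op
    print(fresh.forward(x))
    prior = ([[1, 0, 0], [0, 1, 0]], [[1, 0], [0, 1]])
    warm = LoRALinear(W, prior=prior)   # adapter starts from a given prior
    print(warm.forward(x))
```

With a fresh initialization the layer reproduces the frozen model's output, which is why adapters of this kind can preserve pretrained features. Starting from a prior changes only where fine-tuning begins, not the number of adapter parameters.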

Abstract

Continual adaptation is essential for general autonomous agents. For example, a household robot pretrained with a repertoire of skills must still adapt to unseen tasks specific to each household. Motivated by this, and building upon parameter-efficient fine-tuning in language models, prior work has explored lightweight adapters to adapt pretrained policies, which can preserve learned features from the pretraining phase and demonstrate good adaptation performance. However, these approaches treat each task's learning separately, limiting knowledge transfer between tasks. In this paper, we propose Online Meta-Learned Adapters (OMLA). Instead of applying adapters directly, OMLA facilitates knowledge transfer from previously learned tasks to the task currently being learned through a novel meta-learning objective. Extensive experiments in both simulated and real-world environments demonstrate that OMLA leads to better adaptation performance than the baseline methods. Project page: https://ricky-zhu.github.io/OMLA/.

Paper Structure

This paper contains 18 sections, 4 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Our motivation: instead of learning adapters for each task separately, we develop OMLA to transfer knowledge from previously learned tasks and facilitate adaptation to new tasks.
  • Figure 2: The vision-language policy architecture used in our experiments. The inputs consist of the task description, image observations and proprioceptive states. A history of observations spanning $C$ time steps is used. Following that, an action prediction token [ACT] is prepended to the token list before being processed by the temporal transformer. The output of the temporal transformer corresponding to the token [ACT] serves as the input to the policy head for action prediction.
  • Figure 3: Illustration of the construction of meta-train data and meta-validation data. Instead of using one whole trajectory as the training dataset, which is computationally expensive, we sample the data points in the meta-train episode that are relevant to the data points in the meta-validation episode.
  • Figure 4: Example tasks from LIBERO (top to bottom: LIBERO-OBJECT, LIBERO-SPATIAL, LIBERO-GOAL).
  • Figure 5: (a) The real robot experiment setup. (b) Real robot task examples.
  • ...and 1 more figures