SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks

Yijie Guo, Bingjie Tang, Iretiayo Akinola, Dieter Fox, Abhishek Gupta, Yashraj Narang

TL;DR

SRSA addresses data-efficient learning for contact-rich robotic assembly by retrieving a relevant prior skill using a transfer-success predictor and then fine-tuning that skill on a new task. It jointly encodes task geometry, dynamics, and expert actions into embeddings that are used to predict transfer success and guide skill retrieval, and it employs PPO with self-imitation to stabilize adaptation. Compared to the leading baseline in simulation, the approach yields a 19% relative improvement in success rate, 2.6x lower standard deviation across random seeds, and 2.4x fewer transition samples to reach a satisfactory success rate. Policies trained in simulation and transferred zero-shot achieve a 90% mean success rate in real-world tasks. Additionally, SRSA supports continual learning by expanding the skill library to improve efficiency across new tasks.
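The retrieval step summarized above amounts to scoring every skill in the library for the target task and keeping the top-scoring one. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `retrieve_skill`, the skill names, and the score values are all hypothetical, and the predictor itself is represented only by its outputs.

```python
from typing import Dict, Tuple


def retrieve_skill(predicted_success: Dict[str, float]) -> Tuple[str, float]:
    """Return the source skill with the highest predicted transfer success.

    `predicted_success` maps a (hypothetical) skill name to the predictor's
    estimate of its zero-shot success rate on the target task, in [0, 1].
    """
    if not predicted_success:
        raise ValueError("skill library is empty")
    # Pick the key whose predicted success rate is largest.
    best_skill = max(predicted_success, key=predicted_success.get)
    return best_skill, predicted_success[best_skill]


# Hypothetical predictor outputs for one target task (illustrative values only).
scores = {"skill_a": 0.12, "skill_b": 0.57, "skill_c": 0.33}
skill, score = retrieve_skill(scores)
print(skill, score)  # prints: skill_b 0.57
```

The retrieved skill would then serve as the initialization for fine-tuning on the target task.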

Abstract

Enabling robots to learn novel tasks in a data-efficient manner is a long-standing challenge. Common strategies involve carefully leveraging prior experiences, especially transition data collected on related tasks. Although much progress has been made for general pick-and-place manipulation, far fewer studies have investigated contact-rich assembly tasks, where precise control is essential. We introduce SRSA (Skill Retrieval and Skill Adaptation), a novel framework designed to address this problem by utilizing a pre-existing skill library containing policies for diverse assembly tasks. The challenge lies in identifying which skill from the library is most relevant for fine-tuning on a new task. Our key hypothesis is that skills showing higher zero-shot success rates on a new task are better suited for rapid and effective fine-tuning on that task. To this end, we propose to predict the transfer success for all skills in the skill library on a novel task, and then use this prediction to guide the skill retrieval process. We establish a framework that jointly captures features of object geometry, physical dynamics, and expert actions to represent the tasks, allowing us to efficiently learn the transfer success predictor. Extensive experiments demonstrate that SRSA significantly outperforms the leading baseline. When retrieving and fine-tuning skills on unseen tasks, SRSA achieves a 19% relative improvement in success rate, exhibits 2.6x lower standard deviation across random seeds, and requires 2.4x fewer transition samples to reach a satisfactory success rate, compared to the baseline. Furthermore, policies trained with SRSA in simulation achieve a 90% mean success rate when deployed in the real world. Please visit our project webpage at https://srsa2024.github.io/.
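The three headline numbers are ratios against the baseline rather than absolute differences. As a worked reading, assuming $s$, $\sigma$, and $N$ denote success rate, standard deviation across seeds, and transition samples respectively (symbols introduced here for illustration; the abstract does not state how the quantities are aggregated across tasks):

```latex
\[
\frac{s_{\text{SRSA}} - s_{\text{base}}}{s_{\text{base}}} = 0.19
\;\Longleftrightarrow\;
s_{\text{SRSA}} = 1.19\, s_{\text{base}},
\qquad
\sigma_{\text{SRSA}} = \frac{\sigma_{\text{base}}}{2.6},
\qquad
N_{\text{SRSA}} = \frac{N_{\text{base}}}{2.4}.
\]
```

In particular, a 19% relative improvement is not the same as a gain of 19 percentage points; the absolute gain depends on the baseline's success rate.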

Paper Structure

This paper contains 46 sections, 1 theorem, 6 equations, 19 figures, 3 tables, and 2 algorithms.

Key Result

Proposition 1

Let ${T_i=\{\mathcal{S}, \mathcal{A}, p_i, r, \gamma, \rho_i\}}$ and ${T_j=\{\mathcal{S}, \mathcal{A}, p_j, r, \gamma, \rho_j\}}$ be two MDPs in the task space $\mathcal{T}$. Applying a policy $\pi$ on $T_i$ and $T_j$, we have a function $f$ to describe the value difference:

Figures (19)

  • Figure 1: Overview of SRSA. We address assembly tasks, where the goal is to use a robot arm to insert diverse plugs (i.e., the white parts) into or onto corresponding sockets (i.e., the green parts). Specifically, we propose to predict the transfer success of applying prior skills (i.e., policies) to a new task, retrieve the skill with the highest predicted success rate, and fine-tune it on the new task. During fine-tuning, we accelerate and stabilize adaptation by incorporating imitation learning of high-rewarding transitions from the agent's own replay buffer.
  • Figure 2: Illustration of assembly tasks in AutoMate and SRSA. (a) Samples of assembly tasks in the AutoMate benchmark. (b) 3D-printed parts of corresponding real-world assembly tasks in SRSA. (c) Keyframes from video recordings of our real-world deployments of performant policies.
  • Figure 3: Illustration of skill retrieval approach. We decompose skill retrieval into task feature learning (a-c) and transfer success prediction (d). (a) Geometry features are learned from point-cloud input using a PointNet autoencoder. (b) Dynamics features are learned from transition segments using a state-prediction objective. (c) Expert-action features are learned from transition segments using an action-reconstruction objective. (d) The zero-shot transfer success rate (of applying a source policy to a target task) is predicted using these task features from the source and target tasks.
  • Figure 4: Zero-shot transfer success of retrieved skills when applied to test tasks. For each test task, we retrieve a policy from the prior skill library using 5 different approaches (4 baselines and SRSA). If the approach involves training neural networks, we train on 3 random seeds. Optimal represents the best transfer success rate on the target task among all source policies. Left: Mean and standard deviation of transfer success rate, averaged over 10 test tasks with 3 seeds each. Right: Mean and standard deviation of success rate for each test task, averaged over 3 seeds. Overall, SRSA substantially outperforms baselines.
  • Figure 5: Learning curves on test tasks. The $x$-axis represents training epochs, where each epoch consists of 128 environment steps over 256 parallel environments. The $y$-axis represents success rate. The solid line shows the mean success rate over 5 runs with different random seeds, and the shaded area denotes the standard deviation.
  • ...and 14 more figures
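
Two of the captions above lend themselves to a small sketch: the epoch size stated for Figure 5 and the self-imitation idea in Figure 1. The epoch arithmetic follows directly from the caption. The selection rule is an assumption: "high-rewarding" is taken here to mean the top fraction of transitions ranked by reward, and the function names and the `top_fraction` parameter are hypothetical, since the caption does not define the criterion.

```python
from typing import List, Sequence

# From the Figure 5 caption: one epoch is 128 environment steps
# in each of 256 parallel environments.
STEPS_PER_ENV = 128
NUM_ENVS = 256


def transitions_for(epochs: int) -> int:
    """Number of environment transitions collected over `epochs` epochs."""
    return epochs * STEPS_PER_ENV * NUM_ENVS


def select_high_reward(rewards: Sequence[float], top_fraction: float = 0.1) -> List[int]:
    """Indices of the highest-reward transitions in a replay buffer.

    ASSUMPTION: "high-rewarding" means the top `top_fraction` of transitions
    ranked by reward; this criterion is illustrative, not taken from the paper.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    if len(rewards) == 0:
        return []
    # Keep at least one transition so the imitation batch is never empty.
    k = max(1, round(top_fraction * len(rewards)))
    # Rank buffer indices from highest to lowest reward.
    ranked = sorted(range(len(rewards)), key=lambda i: rewards[i], reverse=True)
    return sorted(ranked[:k])


print(transitions_for(1))  # prints: 32768
rewards = [0.1, 0.9, 0.3, 0.7, 0.2]
print(select_high_reward(rewards, top_fraction=0.4))  # prints: [1, 3]
```

Under this reading, the selected transitions would supply the state-action pairs for an imitation term added alongside the reinforcement learning objective during fine-tuning.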

Theorems & Definitions (2)

  • Proposition 1
  • proof