ASC: Adaptive Skill Coordination for Robotic Mobile Manipulation

Naoki Yokoyama; Alex Clegg; Joanne Truong; Eric Undersander; Tsung-Yen Yang; Sergio Arnaud; Sehoon Ha; Dhruv Batra; Akshara Rai

TL;DR

Adaptive Skill Coordination (ASC) tackles long-horizon mobile manipulation with three learned components: a library of basic visuomotor skills, a coordination policy that sequences them, and a corrective policy that adapts them in out-of-distribution states. Trained entirely in simulation, ASC transfers zero-shot to the Boston Dynamics Spot robot and is deployed in eight real-world environments without detailed maps or precise object locations. In quantitative comparisons in two of those environments, ASC succeeds in 59/60 episodes (98%), versus 44/60 (73%) for sequential skill execution; coordination plus correction reduces hand-off failures and improves robustness to layout changes, dynamic obstacles, and unexpected disturbances. The learned skills are also reported to outperform baselines and the Boston Dynamics API in long-range navigation and occluded-object grasping. The work highlights the practical viability of sim-to-real learned components for vision-based mobile manipulation with minimal environment knowledge.

Abstract

We present Adaptive Skill Coordination (ASC) -- an approach for accomplishing long-horizon tasks like mobile pick-and-place (i.e., navigating to an object, picking it, navigating to another location, and placing it). ASC consists of three components -- (1) a library of basic visuomotor skills (navigation, pick, place), (2) a skill coordination policy that chooses which skill to use when, and (3) a corrective policy that adapts pre-trained skills in out-of-distribution states. All components of ASC rely only on onboard visual and proprioceptive sensing, without requiring detailed maps with obstacle layouts or precise object locations, easing real-world deployment. We train ASC in simulated indoor environments, and deploy it zero-shot (without any real-world experience or fine-tuning) on the Boston Dynamics Spot robot in eight novel real-world environments (one apartment, one lab, two microkitchens, two lounges, one office space, one outdoor courtyard). In rigorous quantitative comparisons in two environments, ASC achieves near-perfect performance (59/60 episodes, or 98%), while sequentially executing skills succeeds in only 44/60 (73%) episodes. Extensive perturbation experiments show that ASC is robust to hand-off errors, changes in the environment layout, dynamic obstacles (e.g., people), and unexpected disturbances. Supplementary videos at adaptiveskillcoordination.github.io.
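The three-component structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the observation fields (`dist_to_goal`, `holding`, `obstacle_near`), the hand-written rules standing in for learned policies, and the choice to apply the correction as an additive offset to the selected skill's action are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: real inputs would be onboard visual and
# proprioceptive observations; real outputs would be base velocities
# and joint position deltas.
Observation = Dict[str, float]
Action = List[float]


@dataclass
class Skill:
    name: str
    policy: Callable[[Observation], Action]


# (1) Library of basic skills. Constant actions replace trained policies.
SKILLS = [
    Skill("navigate", lambda obs: [0.5, 0.0, 0.0]),
    Skill("pick", lambda obs: [0.0, 0.0, 0.1]),
    Skill("place", lambda obs: [0.0, 0.0, -0.1]),
]


# (2) Coordination policy: picks a skill index from the observation.
# A hand-written rule is used here in place of a learned policy.
def choose_skill(obs: Observation) -> int:
    if obs["dist_to_goal"] > 0.5:
        return 0  # still far away: navigate
    return 2 if obs["holding"] > 0.5 else 1  # place if holding, else pick


# (3) Corrective policy: adjusts the chosen skill's action.
# The additive form is an assumption, not taken from the paper.
def correct(obs: Observation) -> Action:
    if obs["obstacle_near"] > 0.5:
        return [-0.25, 0.0, 0.0]  # slow forward motion near an obstacle
    return [0.0, 0.0, 0.0]


def step(obs: Observation) -> Tuple[str, Action]:
    """Select a skill, run it, and add the correction to its action."""
    skill = SKILLS[choose_skill(obs)]
    base_action = skill.policy(obs)
    adjusted = [a + c for a, c in zip(base_action, correct(obs))]
    return skill.name, adjusted
```

Calling `step` once per control tick with the latest observation returns the active skill's name and the adjusted action, so the skill selection is re-evaluated at every tick rather than fixed in a predetermined sequence.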

Paper Structure (23 sections, 8 equations, 10 figures, 2 tables)

Introduction
Related Works
Task: Mobile pick-and-place
Adaptive Skill Coordination
Basic visuomotor skills
Skill coordination and correction
Training Details
Experimental Evaluation
Quantitative experiments in the real world
Mobile pick-and-place in simulation
Robustness to perturbations
Comparing ASC skills and the Boston Dynamics API
Limitations and Conclusion
Appendix
Real-world and simulated environments and observations
...and 8 more sections

Figures (10)

Figure 1: Adaptive Skill Coordination (ASC) is deployed on Spot in a novel environment and tasked with mobile pick-and-place, using learned sensor-to-action skills. The robot starts at its dock (red, A), navigates to a pick receptacle (green, B, D, F), searches for and picks an object, navigates to a place receptacle (blue, C, E, G), places the object at its target place location, and then repeats the sequence.
Figure 2: The robot navigates to a receptacle at $(x, y, \theta)_{pick}$, searches for and picks a target object, navigates to the place receptacle, located at $(x, y, \theta)_{place}$, and places the object at the target place location $(x, y, z)_{tgt}$. Precise object locations and a detailed map of the environment with obstacles are not given.
Figure 4: Training ASC consists of two steps: (left) First, we train a library of 3 basic visuomotor skills in diverse simulated environments. The skills are trained using RL to achieve the relatively shorter-horizon tasks of navigation, picking and placing, and command robot base velocities and joint position deltas. (right) Next, we train a skill coordination policy that chooses which skills are appropriate to use based on observations, and a corrective policy that adapts the pre-trained skills in out-of-distribution states, for the task of mobile pick-and-place.
Figure 5: We present quantitative experiments in an Apartment and a Lab, 2 of the 8 unseen real-world environments ASC is deployed in.
Figure 6: ASC is deployed in eight real environments (including Apartment and Lab), showing in-the-wild mobile pick-and-place capabilities.
...and 5 more figures
