Self-Improving Autonomous Underwater Manipulation

Ruoshi Liu, Huy Ha, Mengxue Hou, Shuran Song, Carl Vondrick

Abstract

Underwater robotic manipulation faces significant challenges due to complex fluid dynamics and unstructured environments, causing most manipulation systems to rely heavily on human teleoperation. In this paper, we introduce AquaBot, a fully autonomous manipulation system that combines behavior cloning from human demonstrations with self-learning optimization to improve beyond human teleoperation performance. With extensive real-world experiments, we demonstrate AquaBot's versatility across diverse manipulation tasks, including object grasping, trash sorting, and rescue retrieval. Our real-world experiments show that AquaBot's self-optimized policy outperforms a human operator by 41% in speed. AquaBot represents a promising step towards autonomous and self-improving underwater manipulation systems. We open-source both hardware and software implementation details.

Paper Structure

This paper contains 16 sections, 1 equation, 6 figures, 5 tables.

Figures (6)

  • Figure 1: AquaBot combines behavior cloning with self-learning to optimize fully autonomous end-to-end visuomotor policies, achieving efficient manipulation skills across a wide range of tasks, including generalization to unseen objects (Rock Grasping), long-horizon tasks (Trash Sorting), and robustness against large perturbations from unmodelled deformable and articulated objects (Rescue Retrieval).
  • Figure 2: Our accessible hardware platform ($2000 USD) consists of 2 cameras and a parallel jaw gripper for research and development of underwater visuomotor policy learning.
  • Figure 3: Learning Framework. In the first stage (a), we train our base policy on offline human demonstrations. In the second stage (b), we roll out the behavior-cloned policy to collect additional self-learning data and fit a surrogate model online, which is then used to optimize the motor speed $\delta$.
  • Figure 4: Dynamic and Robust Manipulation. We plot policy outputs below their corresponding third-person views. (a) shows how the robot decelerates by applying backward forces when it is close to the object, demonstrating proficiency in underwater dynamics. (b) shows how the policy will retry after unstable grasps, demonstrating robustness.
  • Figure 5: Through self-learning (SL), AquaBot learns by trial and error to accelerate a manipulation policy learned with Behavior Cloning (BC). After only 100 iterations, it performs the same manipulation task more efficiently than both the vanilla BC policy and human experts.
  • ...and 1 more figure
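
The second stage described in the Figure 3 caption (roll out the behavior-cloned policy, fit a surrogate model online, and use it to choose the motor speed $\delta$) can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the authors' implementation: `true_time`, `rollout`, the feature basis, the epsilon-greedy exploration rule, and the range of $\delta$ are all hypothetical stand-ins, and $\delta$ is treated here as a single scalar speed scale. The default of 100 iterations simply echoes the figure quoted in the Figure 5 caption.

```python
import numpy as np

LOW, HIGH = 0.5, 3.0  # assumed admissible range for the speed scale delta


def true_time(delta):
    """Toy noise-free completion time in seconds (illustrative only).

    Moving faster shortens travel time, but overshoot grows with speed.
    """
    return 20.0 / delta + 2.0 * delta ** 2


def rollout(delta, rng):
    """Hypothetical stand-in for one real rollout of the BC policy at speed delta."""
    return true_time(delta) + rng.normal(0.0, 0.2)


def features(delta):
    """Feature basis for the surrogate; chosen for this toy cost, not from the paper."""
    delta = np.asarray(delta, dtype=float)
    return np.stack([1.0 / delta, np.ones_like(delta), delta, delta ** 2], axis=-1)


def fit_surrogate(deltas, times):
    """Least-squares fit of predicted completion time as a function of delta."""
    weights, *_ = np.linalg.lstsq(features(deltas), np.asarray(times), rcond=None)
    return weights


def self_learn(n_iters=100, epsilon=0.3, seed=0):
    """Alternate between refitting the surrogate and trying a new delta."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(LOW, HIGH, 251)
    # Start from a few evenly spaced trial speeds.
    deltas = [float(d) for d in np.linspace(LOW, HIGH, 5)]
    times = [rollout(d, rng) for d in deltas]
    for _ in range(n_iters):
        weights = fit_surrogate(deltas, times)
        if rng.random() < epsilon:
            # Explore: try a random speed.
            delta = rng.uniform(LOW, HIGH)
        else:
            # Exploit: try the speed the surrogate predicts is fastest.
            delta = grid[np.argmin(features(grid) @ weights)]
        deltas.append(float(delta))
        times.append(rollout(delta, rng))
    weights = fit_surrogate(deltas, times)
    return float(grid[np.argmin(features(grid) @ weights)])


if __name__ == "__main__":
    best = self_learn()
    print(f"selected delta: {best:.2f}, toy completion time: {true_time(best):.1f} s")
```

In this toy setting the loop only ever sees noisy completion times, yet the surrogate lets it settle on a speed that beats the nominal `delta = 1.0` baseline. A real system would replace `rollout` with actual robot trials and would likely need a richer surrogate than the hand-picked basis used here.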