An Open-Source Soft Robotic Platform for Autonomous Aerial Manipulation in the Wild

Erik Bauer, Marc Blöchlinger, Pascal Strauch, Arman Raayatsanati, Curdin Cavelti, Robert K. Katzschmann

TL;DR

The paper addresses autonomous aerial manipulation without external perception. It introduces an open-source soft-robotic platform that relies solely on onboard SLAM for self-localization and a learning-based RGB-D segmentation pipeline for target localization. The platform combines a modular hardware/software stack (ROS 2, PX4) with a Fin Ray-inspired gripper and a custom power-management board, and is validated through flight and grasping experiments. Core contributions include zero-shot grasping across indoor and outdoor settings and a robust onboard perception pipeline, with an average grasp success rate of 85% over 144 attempts. By releasing hardware and software openly, the work lowers barriers to deploying and extending autonomous aerial manipulation in unstructured environments.

Abstract

Aerial manipulation combines the versatility and speed of flying platforms with the functional capabilities of mobile manipulation, but it presents significant challenges because it requires precise localization and control. Traditionally, researchers have relied on offboard perception systems, which restrict operation to expensive, specially equipped indoor environments. In this work, we introduce a novel platform for autonomous aerial manipulation that exclusively utilizes onboard perception systems. Our platform can perform aerial manipulation in various indoor and outdoor environments without depending on external perception systems. Our experimental results demonstrate the platform's ability to autonomously grasp various objects in diverse settings. This advancement significantly improves the scalability and practicality of aerial manipulation applications by eliminating the need for costly tracking solutions. To accelerate future research, we open source our ROS 2 software stack and custom hardware design, making our contributions accessible to the broader research community.

Paper Structure (20 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Our autonomous aerial platform demonstrates its aerial manipulation capabilities with onboard-only perception.
  • Figure 2: The overall pipeline of our aerial manipulation system. The visual perception module includes SLAM for estimating the system's pose and target localization to identify objects. The planning module encompasses mission control and trajectory planning based on visual estimates. Finally, the actuation module involves flight and gripper control to execute the planned trajectory and manipulate objects.
  • Figure 3: The component architecture with corresponding connections. The onboard components, including the OAK-D, Arduino Nano, and Pixhawk 6C, are connected to the Odroid H3+ via USB. The onboard and offboard computers exchange data between the modules over a WiFi connection. The offboard PC also establishes a radio connection to the Pixhawk 6C flight controller, allowing the offboard modules to communicate with it directly.
  • Figure 4: A high-level overview of the target localization pipeline. The target localization process starts at timestep $t=i$, with the operator selecting the target object in an RGB livestream of the onboard camera. We use SAM2 to segment and track the target object across subsequent frames. Each mask is fused with the corresponding depth image to produce a partial point cloud of the object, which is then used for grasp planning. The individual target point estimates are fused via 1-point RANSAC to compute a robust estimate of the target pose.
  • Figure 5: We evaluate the accuracy of the Spectacular AI SLAM system against motion capture ground truth during a sample grasping mission. All coordinates are expressed in NED (north-east-down) convention. The two trajectories have been aligned using Umeyama alignment with EVO (Grupp, 2017). We observe a peak relative pose error (RPE) of 0.042 during a fast descent towards the object, whereas the mean RPE is 0.028.
  • ...and 3 more figures
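
The Figure 4 caption mentions fusing individual target point estimates via 1-point RANSAC. A minimal sketch of that idea follows, assuming each estimate is a 3D point (for example, one per tracked frame) and covering position only, not a full pose. The function name `one_point_ransac`, the inlier radius, and the iteration count are illustrative assumptions, not values or code from the paper.

```python
import numpy as np


def one_point_ransac(points, inlier_radius=0.05, iterations=100, seed=0):
    """Fuse noisy 3D point estimates into one robust position.

    Each hypothesis is a single sampled point (hence "1-point"). Points
    closer than inlier_radius to the hypothesis count as inliers, and the
    result is the mean of the largest inlier set found.
    inlier_radius and iterations are assumed values for illustration.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)

    for _ in range(iterations):
        # Minimal sample: one point fully defines a position hypothesis.
        candidate = points[rng.integers(len(points))]
        distances = np.linalg.norm(points - candidate, axis=1)
        inliers = distances < inlier_radius
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers

    # Refine the hypothesis by averaging over its consensus set.
    return points[best_inliers].mean(axis=0), best_inliers


if __name__ == "__main__":
    # Hypothetical data: a tight cluster of estimates plus two gross outliers.
    estimates = np.array([
        [1.00, 2.00, 0.50],
        [1.01, 1.99, 0.51],
        [0.99, 2.01, 0.49],
        [3.00, 0.00, 0.00],
        [1.00, 2.00, 3.00],
    ])
    position, mask = one_point_ransac(estimates)
    print("fused position:", position)
    print("inlier mask:", mask)
```

Because a single point is enough to propose a position, each hypothesis is cheap to evaluate, and estimates corrupted by a bad mask or depth reading fall outside the consensus set instead of biasing the average.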