aerial-autonomy-stack -- a Faster-than-real-time, Autopilot-agnostic, ROS2 Framework to Simulate and Deploy Perception-based Drones

Jacopo Panerati; Sina Sajjadi; Sina Soleymanpour; Varunkumar Mehta; Iraj Mantegh

TL;DR

aerial-autonomy-stack is an open-source, ROS2-based framework that unifies perception-based drone autonomy across PX4 and ArduPilot on a containerized SITL/HITL platform. It combines Gazebo-based software-in-the-loop simulation, autopilot bridges, middleware (ROS2, DDS, MAVROS, Zenoh), and perception pipelines (YOLO, ONNX, KISS-ICP) to support end-to-end development and deployment. Key contributions include autopilot-agnostic ROS2 actions, networked multi-vehicle simulation, vertical CI/CD validation, and Jetson-in-the-loop capabilities, which reduce sim-to-field friction and shorten iteration cycles. The design supports over-the-air edge deployment, scalable multi-agent testing, and perception-to-control validation.

Abstract

Unmanned aerial vehicles are rapidly transforming multiple applications, from agricultural and infrastructure monitoring to logistics and defense. Introducing greater autonomy to these systems can make them both more effective and more reliable. Thus, the ability to rapidly engineer and deploy autonomous aerial systems has become strategically important. In the 2010s, a combination of high-performance compute, data, and open-source software led to the current deep learning and AI boom, unlocking decades of prior theoretical work. Robotics is on the cusp of a similar transformation. However, physical AI faces unique hurdles, often grouped under the umbrella term "simulation-to-reality gap". These range from modeling shortcomings to the complexity of vertically integrating the highly heterogeneous hardware and software systems typically found in field robots. To address the latter, we introduce aerial-autonomy-stack, an open-source, end-to-end framework designed to streamline the pipeline from (GPU-accelerated) perception to (flight controller-based) action. Our stack allows the development of aerial autonomy using ROS2 and provides a common interface for two of the most popular autopilots: PX4 and ArduPilot. We show that it supports over 20x faster-than-real-time, end-to-end simulation of a complete development and deployment stack -- including edge compute and networking -- significantly compressing the build-test-release cycle of perception-based autonomy.
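To make the faster-than-real-time figure concrete, the short sketch below computes a real-time factor (simulated seconds per wall-clock second) and the wall-clock time a run would take at a given factor. The function names and the 10-minute mission are hypothetical, chosen only for illustration. They are not taken from the paper or from the aerial-autonomy-stack code.

```python
def real_time_factor(sim_seconds: float, wall_seconds: float) -> float:
    """Simulated time divided by wall-clock time; above 1.0 means faster than real time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return sim_seconds / wall_seconds


def wall_time_for(sim_seconds: float, rtf: float) -> float:
    """Wall-clock seconds needed to simulate sim_seconds at real-time factor rtf."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return sim_seconds / rtf


# Hypothetical example: a 10-minute simulated mission at a 20x real-time factor.
mission_seconds = 600.0
print(wall_time_for(mission_seconds, 20.0))      # wall-clock seconds at 20x
print(real_time_factor(mission_seconds, 30.0))   # factor if the run took 30 s
```

Under these assumed numbers, a mission that would occupy ten minutes of flight time completes in about half a minute of wall-clock time, which is the kind of compression of the build-test-release cycle the abstract refers to.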

Paper Structure

This paper contains 45 sections, 6 figures, 2 tables.

Introduction
Related Work
Rationale
Aerial Autonomy Stack
Third-party Integrations
Simulator
Autopilot Firmware
PX4-Autopilot
ArduPilot and ardupilot-gazebo
Middleware
ROS2
DDS
MAVROS
GStreamer
Zenoh
...and 30 more sections

Figures (6)

Figure 1: Gazebo Sim scene rendering (right), FPV camera view with YOLO overlays (bottom left), and RViz LiDAR point cloud visualization (top left).
Figure 2: Block diagram of the software-in-the-loop simulation architecture comprising the simulation-image, ground-image, and aircraft-image containers.
Figure 3: Vehicle models: Holybro X500v2 (top left); 3DR Iris (top right); Standard VTOL (bottom left); and ALTI Transition (bottom right).
Figure 4: 3D world models: Plain (top left); Empty (top right); City (bottom left); and Mountain (bottom right).
Figure 5: Block diagram of the hardware-in-the-loop simulation architecture for two vehicles, with the simulation-image and ground-image containers running on separate amd64 hosts, and two aircraft-image containers on two dedicated NVIDIA Jetson Orin devices. The SIM_SUBNET is created with a router over a wired Ethernet network, while the AIR_SUBNET is created over a wireless mobile ad-hoc network.
...and 1 more figure
