Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor

Anish Bhattacharya, Marco Cannici, Nishanth Rao, Yuezhan Tao, Vijay Kumar, Nikolai Matni, Davide Scaramuzza

TL;DR

This work pre-trains a reactive obstacle-avoidance events-to-control policy on approximated, simulated events, then fine-tunes the perception component on limited real-world events-and-depth data to achieve obstacle avoidance in indoor and outdoor settings.

Abstract

We present the first static-obstacle avoidance method for quadrotors using just an onboard, monocular event camera. Quadrotors are capable of fast and agile flight in cluttered environments when piloted manually, but vision-based autonomous flight in unknown environments is difficult in part due to the sensor limitations of traditional onboard cameras. Event cameras promise nearly zero motion blur and high dynamic range, but they produce a very large volume of events under significant ego-motion and lack a continuous-time sensor model in simulation, which rules out direct sim-to-real transfer. By leveraging depth prediction as a pretext task in our learning framework, we can pre-train a reactive obstacle avoidance events-to-control policy with approximated, simulated events and then fine-tune the perception component with limited events-and-depth real-world data to achieve obstacle avoidance in indoor and outdoor settings. We demonstrate this across two quadrotor-event camera platforms in multiple settings and find, contrary to traditional vision-based works, that low speeds (1 m/s) make the task harder and more prone to collisions, while high speeds (5 m/s) result in better event-based depth estimation and avoidance. We also find that success rates in outdoor scenes can be significantly higher than in certain indoor scenes.

Paper Structure

This paper contains 11 sections, 3 equations, 6 figures.

Figures (6)

  • Figure 1: Our event camera-equipped quadrotor avoids static obstacles under considerable ego-motion. Our simulation pre-trained, events-to-control policy is fine-tuned with real-world perception data, such as that from a forest. We demonstrate obstacle avoidance in indoor, outdoor, and dark environments.
  • Figure 2: A portion of the continuous event stream from the forest trial in Figure \ref{fig:forest-view} (shown at 5% of the true event stream density for readability), with sample 33 ms batches and the corresponding model predictions.
  • Figure 3: Data generation and learning framework. Grayscale images and depths are collected from quadrotor flights in Flightmare \cite{flightmare}. Vid2E \cite{gehrig2020video} processes the images to generate an event stream, which is batched and converted to binary event mask (BEM) representations for input to the depth predictor $D(\theta)$. The perception loss $\mathcal{L}_p$ is computed relative to the ground truth depth. Ground truth obstacle states inform the expert policy, which logs desired velocities for supervising the velocity predictor $V(\phi)$. Multi-dimensional quantities are in bold.
  • Figure 4: Simulation trials conducted in Flightmare, where approximated events (top left) are used to produce predicted depth and velocity commands. Execution success rates across ablations of our training method (lower left) show that jointly training the perception $D(\theta)$ and velocity $V(\phi)$ modules is beneficial. Note that the low observed simulation success rates result from the out-of-distribution event stream approximation (Equation \ref{eq:difflog-evs-approx}) that must be used during real-time execution because no continuous-time event camera simulator is available.
  • Figure 5: (top left) The two quadrotor platforms used for indoor and outdoor experiments with different event cameras. The Falcon250 \cite{tao2023seer} additionally has a VOXL board \cite{voxlDatasheet} for state estimation. (lower left) An event volume (10% of true density) produced by the continuous event stream from the in-the-dark trial, with select event batches, corresponding BEMs, and network predictions shown. Note the high density of negative (blue) events and positive (red) events when the lights turn off and on, respectively. (right) A variety of real experiments in indoor and outdoor conditions. Our event camera-equipped quadrotor can avoid obstacles in the dark, where traditional cameras would fail.
  • ...and 1 more figures
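
The captions for Figures 3 and 4 mention two preprocessing steps: batching an event stream into binary event masks (BEMs), and approximating events at execution time when no continuous-time simulator is available. The following is a minimal NumPy sketch of both, not the authors' implementation. It assumes that a BEM marks a pixel with 1 if any event fired there during the batch, regardless of polarity, and that the approximation thresholds the difference of log intensities between two consecutive grayscale frames. The function names, the `threshold` value, and the `eps` offset are hypothetical and chosen only for illustration.

```python
import numpy as np


def events_to_bem(xs, ys, height, width):
    """Collapse one batch of events into a binary event mask (BEM).

    Assumption: a pixel is 1 if at least one event fired there during
    the batch; polarity and timestamps are discarded.
    """
    bem = np.zeros((height, width), dtype=np.uint8)
    xs = np.asarray(xs, dtype=np.intp)
    ys = np.asarray(ys, dtype=np.intp)
    # Drop events whose coordinates fall outside the sensor area.
    inside = (xs >= 0) & (xs < width) & (ys >= 0) & (ys < height)
    bem[ys[inside], xs[inside]] = 1
    return bem


def approx_bem_from_frames(prev_gray, curr_gray, threshold=0.2, eps=1e-3):
    """Approximate a BEM from two consecutive grayscale frames.

    Assumption: an event is declared wherever the absolute change in
    log intensity reaches `threshold` (a hypothetical contrast value).
    `eps` keeps the logarithm finite for black pixels.
    """
    log_prev = np.log(prev_gray.astype(np.float64) + eps)
    log_curr = np.log(curr_gray.astype(np.float64) + eps)
    return (np.abs(log_curr - log_prev) >= threshold).astype(np.uint8)


if __name__ == "__main__":
    # Four events on a 2x3 sensor; one is a duplicate, one is out of bounds.
    bem = events_to_bem([0, 2, 2, 5], [1, 1, 1, 0], height=2, width=3)
    print(bem)

    # Two frames that differ in brightness at a single pixel.
    prev = np.full((2, 2), 0.5)
    curr = prev.copy()
    curr[0, 0] = 1.0
    print(approx_bem_from_frames(prev, curr))
```

A frame-difference approximation of this kind only sees the net brightness change between two frames, so it can miss events that a real sensor would emit in between. That is consistent with the caption's remark that the approximated stream is out of distribution relative to the training data.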