Self-supervised Domain Adaptation for Visual 3D Pose Estimation of Nano-drone Racing Gates by Enforcing Geometric Consistency

Nicholas Carlotti, Michele Antonazzi, Elia Cereda, Mirko Nava, Nicola Basilico, Daniele Palossi, Alessandro Giusti

TL;DR

This work considers the task of visually estimating the relative pose of a drone racing gate in front of a nano-quadrotor, using a convolutional neural network pre-trained on simulated data to regress the gate's pose, and proposes an unsupervised domain adaptation (UDA) approach that outperforms other SoA UDA approaches.

Abstract

We consider the task of visually estimating the relative pose of a drone racing gate in front of a nano-quadrotor, using a convolutional neural network pre-trained on simulated data to regress the gate's pose. Due to the sim-to-real gap, the pre-trained model underperforms in the real world and must be adapted to the target domain. We propose an unsupervised domain adaptation (UDA) approach using only real image sequences collected by the drone flying an arbitrary trajectory in front of a gate; sequences are annotated in a self-supervised fashion with the drone's odometry as measured by its onboard sensors. On this dataset, a state consistency loss enforces that two images acquired at different times yield pose predictions that are consistent with the drone's odometry. Results indicate that our approach outperforms other SoA UDA approaches, has a low mean absolute error in position (x=26, y=28, z=10 cm) and orientation ($\psi$=13$^{\circ}$), an improvement of 40% in position and 37% in orientation over a baseline. The approach's effectiveness is appreciable with as few as 10 minutes of real-world flight data and yields models with an inference time of 30.4 ms (33 fps) when deployed aboard the Crazyflie 2.1 Brushless nano-drone.
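
The state consistency idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 4-DoF pose layout `(x, y, z, yaw)`, the frame convention (odometry expresses the drone's second pose in the frame of its first pose), the L1 penalty, and all function names are assumptions made for illustration. Since the gate is static, the gate pose predicted at time 2, once mapped through the odometry into the time-1 drone frame, should coincide with the gate pose predicted at time 1.

```python
import numpy as np


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi


def compose(odom, pose):
    """Map `pose`, expressed in drone frame 2, into drone frame 1.

    `odom` is the (assumed) pose of frame 2 in frame 1.
    Both arguments are arrays laid out as (x, y, z, yaw).
    """
    ox, oy, oz, opsi = odom
    px, py, pz, ppsi = pose
    c, s = np.cos(opsi), np.sin(opsi)
    return np.array([
        ox + c * px - s * py,   # rotate about z by the yaw change, then translate
        oy + s * px + c * py,
        oz + pz,
        wrap_angle(opsi + ppsi),
    ])


def state_consistency_loss(pred_1, pred_2, odom_1_to_2):
    """Mean absolute discrepancy between the gate pose predicted at time 1
    and the gate pose predicted at time 2 mapped into the time-1 frame."""
    pred_2_in_1 = compose(odom_1_to_2, pred_2)
    diff = pred_1 - pred_2_in_1
    diff[3] = wrap_angle(diff[3])  # compare yaw on the circle, not the real line
    return np.abs(diff).mean()
```

With perfect predictions and perfect odometry this quantity is zero; in practice the measured odometry carries drift and noise, so the loss acts as a soft constraint on pairs of predictions rather than as an exact label.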

Paper Structure (16 sections, 2 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: We estimate the pose ($x$, $y$, $z$, $\psi$) of drone racing gates by transferring a model trained in simulation to a real-world environment and deploying it aboard the Bitcraze Crazyflie 2.1 Brushless nano-UAV.
  • Figure 2: Assuming perfect odometry (left), the state consistency loss forces the relative gate poses $\mathcal{\hat{P}}_\text{G}^1$ and $\mathcal{\hat{P}}_\text{G}^2$, predicted from the drone's poses at $\mathcal{P}_\text{R}^{1}$ and $\mathcal{P}_\text{R}^{2}$, to be coherent with the drone's relative odometry $\mathcal{O}^{1\rightarrow 2}$ between the two poses. In our experiments, the measured odometry $\mathcal{\hat{O}}^{1\rightarrow 2}$ is collected by the drone itself and is affected by drift and noise (right).
  • Figure 3: The Crazyflie 2.1 Brushless nano-drone used in our experiments.
  • Figure 4: Examples of (left) simulated images from $\mathcal{D}^\text{train}_\text{sim}$, (middle) real-world robot images in $\mathcal{D}^\text{train}_\text{real}$, and (right) images after applying the pencil filter [pham2022pencilnet].
  • Figure 5: Our self-supervised domain adaptation approach vs ground truth on $\mathcal{D}^\text{test}_\text{real}$ for the components of the gate pose ($x, y, z, \psi$).
  • ...and 2 more figures