A Miniature Vision-Based Localization System for Indoor Blimps

Shicong Ma

TL;DR

The paper tackles autonomous indoor localization for miniature blimps using a map-based, vision-driven approach. It builds a sparse 3D model offline via Structure from Motion with SuperPoint features, then continuously estimates the camera pose against this map through 2D-3D data associations and factor-graph-based nonlinear optimization. The main findings show that SuperPoint-based localization is more accurate than traditional descriptors and more robust to viewpoint and illumination changes, with reasonable performance on a ground-station setup; however, the system struggles with fast motion and motion blur. This work demonstrates a practical path toward autonomous indoor blimps using a lightweight perception stack that offloads computation to a ground station, with clear avenues for real-time optimization and deployment on actual blimp hardware.
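
As a rough illustration of what a factor-graph pose estimate over 2D-3D associations typically minimizes, the objective can be sketched as below. This is a generic form, not necessarily the paper's exact formulation. The symbols $z_i$ (observed 2D keypoint), $\pi$ (camera projection), $\bar{\ell}_i$ (mapped landmark position), and the covariances $\Sigma_z$, $\Sigma_\ell$ are assumptions introduced here, as is the reading that the prior terms anchor each landmark to its mapped position.

```latex
\hat{x}_k = \arg\min_{x_k,\;\ell_{1:N}}
  \sum_{i=1}^{N} \left\| \pi(K, x_k, \ell_i) - z_i \right\|^2_{\Sigma_z}
  + \sum_{i=1}^{N} \left\| \ell_i - \bar{\ell}_i \right\|^2_{\Sigma_\ell}
```

The first sum collects the reprojection residuals of the matched landmarks $\ell_i$ seen from camera pose $x_k$; the second keeps the landmarks close to the offline map.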

Abstract

With increasing attention paid to blimp research, I hope to build an indoor blimp that can interact with humans. As a first step, I propose a visual localization system that enables blimps to localize themselves autonomously in an indoor environment. The system first reconstructs the indoor environment using Structure from Motion with SuperPoint visual features. Then, using the resulting sparse point cloud map, it generates camera poses by continuously running pose estimation on visual features matched against the map. In this project, the blimp serves only as a reference mobile platform that constrains the weight of the perception system. The perception system consists of one monocular camera and a Wi-Fi adapter, which capture and transmit visual data to a ground PC station where the algorithms are executed. If successful, this project will transform remote-controlled indoor blimps into autonomous ones that can be used for applications such as surveillance, advertising, and indoor mapping.

Paper Structure (57 sections, 18 equations, 23 figures, 5 tables, 1 algorithm)


Figures (23)

  • Figure 1: Structure of the related work on vision-based localization.
  • Figure 2: Overview of my localization pipeline.
  • Figure 3: Applying the camera extrinsics transforms a 3D point from global coordinates into the camera coordinate system. Image adapted from garsten20206dof
  • Figure 4: Applying the calibration matrix K to a point in camera coordinates projects it into the image as a pixel coordinate. Image adapted from garsten20206dof
  • Figure 5: Pose estimation factor graph. The factor graph contains five prior factors, five landmarks $\ell_i$, and one camera pose $x_k$. The initial estimate of $x_k$ is the pose of the previous state. The generic projection factors, shown as black nodes, represent the 2D-3D correspondences.
  • ...and 18 more figures
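
The two transformations described in the captions of Figures 3 and 4 can be illustrated with a minimal pinhole-camera sketch. It assumes an ideal camera without lens distortion and extrinsics given as a rotation `R` and translation `t` that map world coordinates into the camera frame. The function names and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np


def project_point(K, R, t, point_world):
    """Project a 3D world point to pixel coordinates (ideal pinhole model)."""
    # Extrinsics (Figure 3): world coordinates -> camera coordinates.
    point_cam = R @ point_world + t
    if point_cam[2] <= 0:
        raise ValueError("point lies behind the camera")
    # Calibration matrix K (Figure 4): camera coordinates -> homogeneous pixel.
    uvw = K @ point_cam
    # Divide by depth to obtain the 2D pixel coordinate.
    return uvw[:2] / uvw[2]


def reprojection_residual(K, R, t, point_world, observed_pixel):
    """Difference between the predicted and the observed pixel position."""
    return project_point(K, R, t, point_world) - observed_pixel


# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)      # camera axes aligned with the world axes (assumption)
t = np.zeros(3)    # camera placed at the world origin (assumption)

landmark = np.array([0.2, -0.1, 2.0])
pixel = project_point(K, R, t, landmark)
residual = reprojection_residual(K, R, t, landmark, np.array([372.0, 214.0]))
```

A residual of this kind is the quantity that a projection factor in a graph like the one in Figure 5 would penalize for each 2D-3D correspondence.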