MonoSpheres: Large-Scale Monocular SLAM-Based UAV Exploration through Perception-Coupled Mapping and Planning
Tomáš Musil, Matěj Petrlík, Martin Saska
TL;DR
This work tackles monocular UAV exploration of unknown 3D environments with a perception-coupled mapping and planning framework. It builds a sphere-based map from sparse monocular SLAM points, augmented with depth interpolation (OVDE) and obstacle filtering (DBOF), to represent free space and obstacles robustly without dense sensors or GPUs. Frontiers are sampled directly on the free-space polyhedron, and exploration viewpoints are chosen so that the UAV translates enough to estimate depth from parallax, which improves safety and scalability. Extensive real-world and simulated experiments, including ablations, demonstrate large-scale indoor and outdoor exploration. The code is open-sourced to support future research.
Abstract
Autonomous exploration of unknown environments is a key capability for mobile robots, but it is largely unsolved for robots equipped with only a single monocular camera and no dense range sensors. In this paper, we present a novel approach to monocular vision-based exploration that can safely cover large-scale unstructured indoor and outdoor 3D environments by explicitly accounting for the properties of a sparse monocular SLAM frontend in both mapping and planning. The mapping module solves the problems of sparse depth data, free-space gaps, and large depth uncertainty by oversampling free space in texture-sparse areas and keeping track of obstacle position uncertainty. The planning module handles the added free-space uncertainty through rapid replanning and perception-aware heading control. We further show that frontier-based exploration is possible with sparse monocular depth data when parallax requirements and the possibility of textureless surfaces are taken into account. We evaluate our approach extensively in diverse real-world and simulated environments, including ablation studies. To the best of the authors' knowledge, the proposed method is the first to achieve 3D monocular exploration in real-world unstructured outdoor environments. We open-source our implementation to support future research.
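The parallax requirement mentioned above can be illustrated with a short geometric sketch. It is not the authors' implementation: the function names `parallax_angle` and `has_sufficient_parallax` and the 1-degree default threshold are assumptions made for illustration only. The sketch computes the angle subtended at a 3D point by two camera centers, which is the standard quantity that determines how well a monocular system can triangulate that point.

```python
import math

def parallax_angle(cam_a, cam_b, point):
    """Angle in radians subtended at `point` by two camera centers."""
    ray_a = [p - a for p, a in zip(point, cam_a)]
    ray_b = [p - b for p, b in zip(point, cam_b)]
    norm_a = math.sqrt(sum(x * x for x in ray_a))
    norm_b = math.sqrt(sum(x * x for x in ray_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("point coincides with a camera center")
    cos_angle = sum(x * y for x, y in zip(ray_a, ray_b)) / (norm_a * norm_b)
    # Clamp against rounding error before calling acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle)

def has_sufficient_parallax(cam_a, cam_b, point, min_angle_deg=1.0):
    """Hypothetical check; the threshold is an illustrative assumption."""
    return math.degrees(parallax_angle(cam_a, cam_b, point)) >= min_angle_deg

# Sideways motion of 1 m while observing a point 10 m ahead gives parallax.
print(has_sufficient_parallax((0, 0, 0), (1, 0, 0), (0.5, 0, 10)))
# Flying straight toward the same point gives none.
print(has_sufficient_parallax((0, 0, 0), (0, 0, 1), (0, 0, 10)))
```

The second case shows why viewpoint selection matters for a monocular system: motion directly along the viewing ray toward a point yields zero parallax for it, so its depth cannot be triangulated from that motion alone.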
