Robust Robotic Exploration and Mapping Using Generative Occupancy Map Synthesis

Lorin Achey, Alec Reed, Brendan Crowe, Bradley Hayes, Christoffer Heckman

TL;DR

The paper tackles robust robotic exploration under occlusions by introducing SceneSense, a diffusion-based generative occupancy predictor that forecasts unseen geometry from partial observations and merges it probabilistically into a running map. It formalizes a dense-occupancy framework with forward and reverse diffusion, and implements an unconditional denoising network that operates alongside OctoMap, occupancy inpainting, and multi-prediction merging to produce coherent, traversable maps. Empirical results on a quadruped platform show substantial improvements in map fidelity and exploration robustness, including notable FID reductions and better traversal in challenging scenarios such as startup holes and narrow hallways. The work demonstrates that SceneSense can function as a drop-in enhancement to existing planning stacks, enabling more consistent and efficient exploration, with practical implications for real-world robotic deployments and future planning-system integration improvements.

Abstract

We present a novel approach for enhancing robotic exploration by using generative occupancy mapping. We implement SceneSense, a diffusion model designed and trained for predicting 3D occupancy maps given partial observations. Our proposed approach probabilistically fuses these predictions into a running occupancy map in real time, resulting in significant improvements in map quality and traversability. We deploy SceneSense on a quadruped robot and validate its performance with real-world experiments to demonstrate the effectiveness of the model. In these experiments we show that occupancy maps enhanced with SceneSense predictions better estimate the distribution of our fully observed ground truth data ($24.44\%$ FID improvement around the robot and $75.59\%$ improvement at range). We additionally show that integrating SceneSense-enhanced maps into our robotic exploration stack as a "drop-in" map improvement, utilizing an existing off-the-shelf planner, results in improvements in robustness and traversability time. Finally, we show results of full exploration evaluations with our proposed system in two dissimilar environments and find that locally enhanced maps provide more consistent exploration results than maps constructed only from direct sensor measurements.
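
To make the idea of probabilistically fusing predictions into a running map concrete, here is a minimal sketch using a log-odds occupancy update of the kind commonly used in occupancy mapping. It is not the paper's merging equation, which is not reproduced in this summary. The function names, the `p_hit` and `p_miss` weights, and the clamping bounds are illustrative assumptions, as is the choice to let each prediction vote on every voxel.

```python
import numpy as np

def prob_to_logodds(p):
    """Convert an occupancy probability to log-odds."""
    return np.log(p / (1.0 - p))

def logodds_to_prob(l):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + np.exp(-l))

def fuse_predictions(running_logodds, predictions,
                     p_hit=0.7, p_miss=0.4, clamp=(-2.0, 3.5)):
    """Fuse several binary occupancy predictions into a log-odds map.

    running_logodds: float array, current map in log-odds form.
    predictions: iterable of boolean arrays (True = predicted occupied).
    p_hit, p_miss, clamp: assumed, illustrative update weights and bounds.
    """
    l_hit = prob_to_logodds(p_hit)    # positive increment
    l_miss = prob_to_logodds(p_miss)  # negative increment
    fused = running_logodds.copy()    # leave the caller's map untouched
    for pred in predictions:
        # Each prediction votes per voxel: occupied adds, free subtracts.
        fused += np.where(pred, l_hit, l_miss)
    return np.clip(fused, *clamp)

# Four hypothetical predictions over two voxels: voxel 0 is predicted
# occupied every time, voxel 1 only once.
preds = [np.array([True, True]),
         np.array([True, False]),
         np.array([True, False]),
         np.array([True, False])]
probs = logodds_to_prob(fuse_predictions(np.zeros(2), preds))
print(probs)
```

Under these assumed weights, a voxel that appears in most predictions ends up with high occupancy probability, while a voxel predicted only occasionally is pushed back below 0.5, which is how infrequent predictions get filtered out.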

Paper Structure

This paper contains 43 sections, 7 equations, 15 figures, 4 tables, 1 algorithm.

Figures (15)

  • Figure 1: The reverse diffusion process takes the local occupancy information and the Gaussian noise of the area to be diffused over. Noise commensurate with the current diffusion step is added to the local occupancy information, which includes occupied (green) and observed unoccupied (red) data. The result is inpainted into the noisy local occupancy prediction. The inpainted noise data is provided to the denoising network, which generates a new noisy geometry prediction at $t-1$. This process is repeated as the starting noise $x_T$ is iteratively denoised to $x_0$, which is the final geometry prediction from the framework. This process is further detailed in Algorithm 1.
  • Figure 2: Block diagram showing the system design for onboard SceneSense occupancy prediction. The system uses an IMU and a LIDAR sensor to generate odometry and occupancy maps. Once the occupancy map is built, a graph is constructed to evaluate frontier points for occupancy prediction. Local occupancy is then subselected around these points and sent to the SceneSense framework, which provides occupancy predictions. These predictions are then merged with the running occupancy map using the probabilistic update rule.
  • Figure 3: Spot robotic platform with onboard compute and sensors. The sensor suite consists of a 64-beam OS1 lidar as well as 3 FLIR GigE cameras.
  • Figure 4: Multi-Prediction Occupancy Merging: SceneSense predicts various occupancy maps based on equivalent input data that form a distribution. This distribution forms a curve where more likely predictions occur more often, and less likely predictions occur infrequently. These predictions are fused into the merged map using Eq. \ref{eq:prob_map_merging}. The resulting merged map naturally filters out the unlikely voxel predictions, forming an extended occupancy map.
  • Figure 5: (a) At startup, the robot cannot observe the ground directly under it due to the mounting location of the lidar. (b) SceneSense generates occupancy predictions (Red Voxels) that fill in the hole under the robot as well as some of the vertical occluded geometry. With the additional predictions, the robot autonomously generates a traversable path and begins exploring without the need for manual intervention via teleoperation.
  • ...and 10 more figures
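
The reverse diffusion loop with occupancy inpainting described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration and not the paper's Algorithm 1. It assumes a standard DDPM noise schedule and ancestral sampling step, a hypothetical `denoiser(x, t)` callable that predicts the noise in `x`, and an occupancy encoding where observed voxels carry fixed real values. The final step that restores the observed voxels is also an assumption.

```python
import numpy as np

def make_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (assumed; the paper's may differ)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def inpainted_reverse_diffusion(known, known_mask, denoiser, T=50, seed=0):
    """Denoise x_T to x_0 while inpainting observed occupancy at each step.

    known: float array of observed occupancy values.
    known_mask: boolean array, True where a voxel was actually observed.
    denoiser: hypothetical callable (x, t) -> predicted noise, same shape.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(T)
    x = rng.standard_normal(known.shape)  # x_T: pure Gaussian noise
    for t in range(T - 1, -1, -1):
        # Add noise commensurate with step t to the observed voxels.
        eps_known = rng.standard_normal(known.shape)
        noised_known = (np.sqrt(alpha_bars[t]) * known
                        + np.sqrt(1.0 - alpha_bars[t]) * eps_known)
        # Inpaint the noised observations into the current prediction.
        x = np.where(known_mask, noised_known, x)
        # Predict the noise and take one DDPM ancestral sampling step.
        eps = denoiser(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) \
            / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    # Assumption: keep the observed voxels exactly as measured in x_0.
    return np.where(known_mask, known, x)

# Usage with a placeholder denoiser that predicts zero noise everywhere;
# a trained network would be substituted here.
known = np.zeros((4, 4, 4))
known[0] = 1.0
mask = np.zeros((4, 4, 4), dtype=bool)
mask[0] = True
x0 = inpainted_reverse_diffusion(known, mask,
                                 lambda x, t: np.zeros_like(x), T=10)
```

Because the observed voxels are re-noised and re-inserted at every step, the network only ever has to fill in the unobserved region in a way that stays consistent with what the sensors measured.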