Learning to Explore using Active Neural SLAM

Devendra Singh Chaplot, Dhiraj Gandhi, Saurabh Gupta, Abhinav Gupta, Ruslan Salakhutdinov

TL;DR

This paper presents Active Neural SLAM (ANS), a modular, hierarchical navigation framework that combines a learned Neural SLAM module with a Global policy and a Local policy, connected via an analytical path planner. By training the components separately within a classical navigation pipeline, ANS achieves robust exploration in realistic 3D environments and transfers to real-world robotics and to the PointGoal task, with significant sample-efficiency gains. Key contributions include a realistic actuation/sensor noise model, a Neural SLAM module made up of a Mapper and a Pose Estimator, and empirical evidence of superior performance and generalization over end-to-end baselines. The approach applies learning where it benefits most while retaining the reliability and efficiency of traditional planning.

Abstract

This work presents a modular and hierarchical approach to learn policies for exploring 3D environments, called 'Active Neural SLAM'. Our approach leverages the strengths of both classical and learning-based methods by using analytical path planners with a learned SLAM module and with global and local policies. The use of learning provides flexibility with respect to input modalities (in the SLAM module), leverages structural regularities of the world (in global policies), and provides robustness to errors in state estimation (in local policies). Such use of learning within each module retains its benefits, while at the same time, hierarchical decomposition and modular training allow us to sidestep the high sample complexities associated with training end-to-end policies. Our experiments in visually and physically realistic simulated 3D environments demonstrate the effectiveness of our approach over past learning and geometry-based approaches. The proposed model can also be easily transferred to the PointGoal task and was the winning entry of the CVPR 2019 Habitat PointGoal Navigation Challenge.

Paper Structure

This paper contains 18 sections, 5 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Overview of our approach. The Neural SLAM module predicts a map and agent pose estimate from incoming RGB observations and sensor readings. This map and pose are used by a Global policy to output a long-term goal, which is converted to a short-term goal using an analytic path planner. A Local Policy is trained to navigate to this short-term goal.
  • Figure 2: Architecture of the Neural SLAM module: The Neural SLAM module ($f_{Map}$) takes in the current RGB observation, $s_t$, the current and last sensor readings of the agent pose, $x_{t-1:t}'$, the last agent pose estimate, $\hat{x}_{t-1}$, and the map at the previous time step, $m_{t-1}$, and outputs an updated map, $m_{t}$, and the current agent pose estimate, $\hat{x}_t$. 'ST' denotes spatial transformation.
  • Figure 3: Plot showing the $\%$ Coverage as the episode progresses for ANS and the baselines on the large and small scenes in the Gibson Val set as well as the overall Gibson Val set.
  • Figure 4: Exploration visualization. Figure showing a sample trajectory of the Active Neural SLAM model in the Exploration task. Top: RGB observations seen by the agent. Inset: Global ground truth map and pose (not visible to the agent). Bottom: Local map and pose predictions. Long-term goals selected by the Global policy are shown by blue circles. The ground-truth map and pose are under-laid in grey. Map prediction is overlaid in green, with dark green denoting correct predictions and light green denoting false positives. Agent pose predictions are shown in red. The light blue shaded region shows the explored area.
  • Figure 5: Real-world Transfer. Left: Image showing the living area in an apartment used for the real-world experiments. Right: Sample images seen by the robot and the predicted map. The long-term goal selected by the Global Policy is shown by a blue circle on the map.
  • ...and 5 more figures
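
The data flow described in the Figure 1 caption (map and pose estimate, then long-term goal, then short-term goal, then local action) can be illustrated with a toy grid-world loop. This is only a sketch under strong assumptions, not the authors' implementation: every function name below is hypothetical, the three learned modules are replaced by hand-written stubs, the pose is assumed noise-free, and breadth-first search stands in for the analytic path planner, whose actual algorithm is not given in this summary.

```python
from collections import deque

def slam_stub(free, pose, radius=1):
    """Stand-in for Neural SLAM: returns the free cells near the (noise-free) pose."""
    r0, c0 = pose
    return {(r, c)
            for r in range(max(0, r0 - radius), min(len(free), r0 + radius + 1))
            for c in range(max(0, c0 - radius), min(len(free[0]), c0 + radius + 1))
            if free[r][c]}

def global_policy_stub(free, explored, pose):
    """Stand-in for the Global policy: nearest unexplored free cell, or None if none remain."""
    candidates = [(r, c) for r, row in enumerate(free)
                  for c, is_free in enumerate(row)
                  if is_free and (r, c) not in explored]
    if not candidates:
        return None
    return min(candidates, key=lambda cell: abs(cell[0] - pose[0]) + abs(cell[1] - pose[1]))

def plan_short_term_goal(free, start, goal, horizon=1):
    """BFS on a 4-connected grid; returns the cell `horizon` steps along a shortest path."""
    rows, cols = len(free), len(free[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and free[nr][nc] and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    if goal not in parent:
        return start  # goal unreachable: stay in place
    path = []
    cell = goal
    while cell is not None:  # walk back from goal to start
        path.append(cell)
        cell = parent[cell]
    path.reverse()
    return path[min(horizon, len(path) - 1)]

def local_policy_stub(pose, short_term_goal):
    """Stand-in for the Local policy: moves directly onto the short-term goal."""
    return short_term_goal

def explore(free, start, max_steps=200):
    """Runs the hierarchical loop and returns the fraction of free cells explored."""
    n_free = sum(cell for row in free for cell in row)
    explored, pose, goal = set(), start, None
    for _ in range(max_steps):
        explored |= slam_stub(free, pose)            # update map around the agent
        if goal is None or goal in explored:         # long-term goal is kept until seen
            goal = global_policy_stub(free, explored, pose)
        if goal is None:                             # nothing left to explore
            break
        short_term_goal = plan_short_term_goal(free, pose, goal)
        pose = local_policy_stub(pose, short_term_goal)
    return len(explored) / n_free

# 1 = free cell, 0 = obstacle (hypothetical example layout)
GRID = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
]
coverage = explore(GRID, (0, 0))
```

Holding the long-term goal fixed until it has been observed mirrors the idea that the Global policy acts on a coarser time scale than the Local policy. In this toy version it also keeps the agent from oscillating between competing goals.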