Towards Map-Agnostic Policies for Adaptive Informative Path Planning

Julius Rückin, David Morilla-Cabello, Cyrill Stachniss, Eduardo Montijano, Marija Popović

TL;DR

The paper tackles online adaptive informative path planning (IPP) in unknown terrains by introducing a formulation that is agnostic to the underlying map representation. It defines a unified planning state and a probabilistic belief over interesting areas, paired with a reward that combines uncertainty reduction and region-of-interest (ROI) probability. This makes it possible to train a single policy $\pi^*$ with PPO, conditioned on mission hyperparameters $\mathcal H$. The approach performs competitively with state-of-the-art map-specific methods on simulated and real-world datasets, reduces replanning runtime, and integrates with existing non-learning-based search algorithms. As a result, adaptive IPP policies can be deployed across diverse missions without map-specific retraining; the code is open source.

Abstract

Robots are frequently tasked with gathering relevant sensor data in unknown terrains. A key challenge for classical path planning algorithms used for autonomous information gathering is adaptively replanning paths online as the terrain is explored, given limited onboard compute resources. Recently, learning-based approaches have emerged that train planning policies offline and enable computationally efficient online replanning by performing policy inference. These approaches are designed and trained for terrain monitoring missions assuming a single specific map representation, which limits their applicability to different terrains. To address these issues, we propose a novel formulation of the adaptive informative path planning problem unified across different map representations, enabling training and deploying planning policies in a larger variety of monitoring missions. Experimental results validate that our novel formulation easily integrates with classical non-learning-based planning approaches while maintaining their performance. Our trained planning policy performs similarly to state-of-the-art map-specifically trained policies. We validate our learned policy on unseen real-world terrain datasets.

Paper Structure

This paper contains 13 sections, 8 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Robots perform continuous- or discrete-valued terrain feature monitoring missions, e.g. mapping surface temperature or urban semantics. We transform mission-specific terrain map representations, e.g. Gaussian processes or occupancy grid maps, into a novel unified state representation for adaptive IPP. In this way, we design and train a single map-agnostic planning policy applicable to largely varying terrain monitoring missions.
  • Figure 1: Comparison of state-of-the-art map-specifically designed and trained methods to our map-agnostic planning policy (RL-Ours) on simulated continuous- and discrete-valued terrain feature monitoring missions. Best average performances are marked in bold, second-best average performances are underlined if standard deviations in brackets overlap. Our map-agnostic policy performs best in case of Varying user-defined mission hyperparameters and similar to state-of-the-art adaptive IPP methods in case of Static mission hyperparameters.
  • Figure 2: Our unified belief $p(F(\mathbf{x}) \in \mathcal{F}_{I} \mid \hat{F}_t)$ over interesting areas $\mathbf{x} \in \xi_I$ for continuous- (left) and discrete-valued (right) terrain features. Grey areas are unknown with large map uncertainty. (Left) Posterior normal distributions inferred from a Gaussian process or Kalman filter map representation with an interesting value threshold $f_{th} = 0.6$. The unified belief is computed by the orange area under the curve, which is larger for known interesting areas than for unknown uncertain areas. (Right) The unified belief is given by the sum of posterior probability masses over interesting classes (orange) extracted from an occupancy map representation.
  • Figure 2: Comparison of state-of-the-art map-specifically designed and trained methods to our map-agnostic planning policy (RL-Ours) on real-world continuous-valued surface temperature (Temperature-1/2) and discrete-valued urban (Potsdam) and rural (RIT-18) semantic terrain datasets. Best average performances are marked in bold, second-best average performances are underlined if standard deviations in brackets overlap. Our map-agnostic policy performs similarly to state-of-the-art adaptive IPP methods.
  • Figure 3: Continuous (left) and discrete terrain feature fields (right).
  • ...and 1 more figure
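
The unified belief described in the second figure caption above can be illustrated with a minimal sketch. It assumes that the interesting set for continuous features is every value at or above the threshold $f_{th}$, so the belief is the upper-tail mass of the posterior normal distribution, and that discrete maps expose a per-class posterior probability vector. The function names `belief_continuous` and `belief_discrete` are hypothetical and are not taken from the authors' code.

```python
import math


def belief_continuous(mean, std, f_th):
    """Probability that a normally distributed feature value is >= f_th.

    Assumption: the interesting set is {f : f >= f_th}. mean and std are the
    posterior mean and standard deviation at one map location.
    """
    if std <= 0.0:
        # Degenerate posterior: the value is known exactly.
        return 1.0 if mean >= f_th else 0.0
    # Upper-tail mass of N(mean, std^2): 1 - Phi((f_th - mean) / std).
    z = (f_th - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def belief_discrete(class_probs, interesting_classes):
    """Sum of posterior class probabilities over the interesting classes."""
    return sum(class_probs[c] for c in interesting_classes)


# A well-observed interesting cell, an unobserved uncertain cell, and a
# discrete cell whose classes 1 and 2 are treated as interesting.
known_interesting = belief_continuous(mean=0.9, std=0.05, f_th=0.6)
unknown_cell = belief_continuous(mean=0.5, std=0.5, f_th=0.6)
semantic_cell = belief_discrete([0.1, 0.7, 0.2], interesting_classes=[1, 2])
```

Under these assumptions, the well-observed interesting cell receives a belief close to 1, while the unobserved cell stays below 0.5, which matches the caption's remark that the area under the curve is larger for known interesting areas than for unknown uncertain ones.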