Towards Map-Agnostic Policies for Adaptive Informative Path Planning
Julius Rückin, David Morilla-Cabello, Cyrill Stachniss, Eduardo Montijano, Marija Popović
TL;DR
The paper tackles online adaptive informative path planning (IPP) in unknown terrains with varying map representations by introducing a map-agnostic formulation. It defines a unified planning state and a probabilistic belief over interesting areas, paired with a reward that combines uncertainty reduction with region-of-interest (ROI) probability. This allows a single policy $\pi^*$ to be trained with PPO, conditioned on mission hyperparameters $\mathcal H$. The approach performs competitively with state-of-the-art map-specific methods on simulated and real-world datasets, reduces replanning runtime, and integrates with existing non-learning-based search algorithms. The work offers a practical way to deploy adaptive IPP policies across diverse missions without map-specific retraining, and the code is open source.
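To make the reward idea concrete, here is a minimal sketch of one way a reward could combine uncertainty reduction with ROI probability. The exact reward used in the paper is not given in this summary, so the functional form (per-cell entropy reduction weighted by the updated ROI probability) and the names `binary_entropy` and `roi_weighted_reward` are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def binary_entropy(p):
    """Shannon entropy (in nats) of Bernoulli probabilities, elementwise."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def roi_weighted_reward(belief_before, belief_after):
    """Hypothetical reward: entropy reduction per cell, weighted by ROI probability.

    belief_before, belief_after: arrays of per-cell probabilities that a cell
    is interesting, before and after a measurement. This form is assumed for
    illustration and is not taken from the paper.
    """
    reduction = binary_entropy(belief_before) - binary_entropy(belief_after)
    return float(np.sum(np.asarray(belief_after, dtype=float) * reduction))

# Two cells start maximally uncertain; a measurement sharpens only the first.
before = np.array([0.5, 0.5])
after = np.array([0.9, 0.5])
reward = roi_weighted_reward(before, after)
```

Because the belief is a flat array of probabilities, the same function would apply regardless of how the underlying map stores it, which is the sense in which such a reward can be map-agnostic.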
Abstract
Robots are frequently tasked to gather relevant sensor data in unknown terrains. A key challenge for classical path planning algorithms used for autonomous information gathering is adaptively replanning paths online as the terrain is explored, given limited onboard compute resources. Recently, learning-based approaches have emerged that train planning policies offline and enable computationally efficient online replanning by performing policy inference. These approaches are designed and trained for terrain monitoring missions assuming a single specific map representation, which limits their applicability to different terrains. To address this limitation, we propose a novel formulation of the adaptive informative path planning problem unified across different map representations, enabling training and deploying planning policies in a larger variety of monitoring missions. Experimental results validate that our novel formulation easily integrates with classical non-learning-based planning approaches while maintaining their performance. Our trained planning policy performs similarly to state-of-the-art policies trained for a specific map representation. We validate our learned policy on unseen real-world terrain datasets.
