Measurement Simplification in ρ-POMDP with Performance Guarantees
Tom Yotam, Vadim Indelman
TL;DR
This work tackles the computational explosion of belief-space planning in ρ-POMDPs by partitioning the observation space to derive tractable, adaptive bounds on the expected information-theoretic reward. By partitioning future observations into subsets and arranging the partitions in a hierarchical tree, the authors derive upper and lower bounds on conditional entropy that converge to the true reward as the partition depth increases, while reducing per-step computation. The approach is specialized to Gaussian beliefs, where the entropy bound has a closed form in terms of determinants of the augmented information matrix, and it leverages efficient determinant updates via rAMDL. Empirical results in active SLAM, both in simulation and in real-world experiments, show substantial planning speed-ups with performance guarantees compared to state-of-the-art methods such as rAMDL and iSAM2. The framework is general, scalable, and extensible to other belief distributions and information-theoretic rewards, offering a practical path toward fast decision-making under uncertainty with guarantees.
Abstract
Decision making under uncertainty is at the heart of any autonomous system acting with imperfect information. The cost of solving the decision-making problem is exponential in the action and observation spaces, rendering it infeasible for many online systems. This paper introduces a novel approach to efficient decision making by partitioning the high-dimensional observation space. Using the partitioned observation space, we formulate analytical bounds on the expected information-theoretic reward for general belief distributions. These bounds are then used to plan efficiently while keeping performance guarantees. We show that the bounds are adaptive, computationally efficient, and that they converge to the original solution. We extend the partitioning paradigm and present a hierarchy of partitioned spaces that allows greater efficiency in planning. We then propose a specific variant of these bounds for Gaussian beliefs and show a theoretical performance improvement of at least a factor of 4. Finally, we compare our novel method to other state-of-the-art algorithms in active SLAM scenarios, in simulation and in real experiments. In both cases we show a significant speed-up in planning with performance guarantees.
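To make the Gaussian case concrete, the sketch below illustrates the standard determinant-based entropy of a Gaussian belief and the elementary fact that using only a subset of the measurements cannot decrease the posterior entropy. It is not the bound derived in the paper: it assumes a linear measurement model with whitened Jacobian rows, and the names `gaussian_entropy`, `posterior_entropy`, `prior_info`, and `A` are hypothetical, chosen only for illustration.

```python
import numpy as np


def gaussian_entropy(info):
    """Differential entropy (in nats) of a Gaussian given its information matrix."""
    n = info.shape[0]
    sign, logdet = np.linalg.slogdet(info)
    if sign <= 0:
        raise ValueError("information matrix must be positive definite")
    # H = 0.5 * (n * log(2*pi*e) - log det(info)), since det(cov) = 1 / det(info)
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) - logdet)


def posterior_entropy(prior_info, jacobian):
    """Entropy after fusing measurements with a whitened Jacobian (assumed linear model).

    For a linear Gaussian model the posterior information is prior_info + J^T J,
    independent of the measurement values, so this is also the expected entropy.
    """
    return gaussian_entropy(prior_info + jacobian.T @ jacobian)


# Hypothetical toy problem: 4 state dimensions, 6 scalar measurements.
rng = np.random.default_rng(0)
prior_info = np.eye(4)
A = rng.standard_normal((6, 4))

full = posterior_entropy(prior_info, A)        # all measurements
subset = posterior_entropy(prior_info, A[:3])  # one block of a partition only

print(f"entropy with all measurements:  {full:.4f}")
print(f"entropy with a subset (>= full): {subset:.4f}")
```

Because dropping rows of the Jacobian removes a positive semidefinite term from the information matrix, the subset entropy is always an upper bound on the full posterior entropy in this simplified setting, and it is cheaper to evaluate because fewer measurement rows enter the determinant.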
