Perception-Based Beliefs for POMDPs with Visual Observations
Miriam Schäfers, Merlijn Krale, Thiago D. Simão, Nils Jansen, Maximilian Weininger
TL;DR
Vision-based POMDPs are computationally challenging because their observation spaces are massive. The authors introduce PBP, a modular framework that couples a perception model, trained on vision datasets, with belief-based POMDP solvers. The coupling is a perception-based belief update that uses the perception model's output f(s_v|z_v), a distribution over states given a visual observation, while factoring the observation function into vision and non-vision components. They prove that, with a perfect perception model, the PBP update recovers the standard belief update, and they augment it with two uncertainty quantification methods (TUQ and WUQ) to improve robustness against visual corruption. Empirically, PBP instantiated with HSVI or POMCP outperforms end-to-end deep RL baselines on VPOMDP benchmarks and remains robust to noisy visuals. Because the framework is modular, off-the-shelf perception architectures and uncertainty techniques can be used with established POMDP solvers, which broadens the applicability of planning under uncertainty to real-world vision tasks.
Abstract
Partially observable Markov decision processes (POMDPs) are a principled planning model for sequential decision-making under uncertainty. Yet, real-world problems with high-dimensional observations, such as camera images, remain intractable for traditional belief- and filtering-based solvers. To tackle this problem, we introduce the Perception-based Beliefs for POMDPs framework (PBP), which complements such solvers with a perception model. This model takes the form of an image classifier which maps visual observations to probability distributions over states. PBP incorporates these distributions directly into belief updates, so the underlying solver does not need to reason explicitly over high-dimensional observation spaces. We show that the belief update of PBP coincides with the standard belief update if the image classifier is exact. Moreover, to handle classifier imprecision, we incorporate uncertainty quantification and introduce two methods to adjust the belief update accordingly. We implement PBP using two traditional POMDP solvers and empirically show that (1) it outperforms existing end-to-end deep RL methods and (2) uncertainty quantification improves robustness of PBP against visual corruption.
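To make the idea of feeding classifier outputs into a belief update concrete, here is a minimal sketch in plain Python. It is not the paper's exact update rule: it assumes a small discrete state space, a known transition model, and a classifier whose output can stand in for the observation likelihood up to normalization (which holds, for example, when the classifier's prior over states is uniform). The function name `perception_belief_update` and all parameter names are hypothetical, and the fallback for a zero normalizer is an illustrative choice.

```python
def perception_belief_update(belief, transition, action, class_probs):
    """One belief update step that uses classifier probabilities.

    belief:      list of probabilities over states, b(s)
    transition:  transition[a][s][s2] = T(s2 | s, a)
    action:      index of the action taken
    class_probs: classifier output for the new image, f(s2 | z)
                 (assumed here to replace the observation likelihood)
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * transition[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each predicted state by the classifier probability.
    unnormalized = [predicted[s2] * class_probs[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        # Assumption: if the classifier contradicts the prediction entirely,
        # keep the predicted belief rather than dividing by zero.
        return predicted
    return [u / total for u in unnormalized]


# Toy example with two states and one action.
T = [[[0.9, 0.1],
      [0.2, 0.8]]]
b0 = [0.5, 0.5]

# A confident but imperfect classifier output.
b1 = perception_belief_update(b0, T, 0, [0.8, 0.2])

# A one-hot (exact) classifier output puts all mass on the identified state.
b_exact = perception_belief_update(b0, T, 0, [1.0, 0.0])
```

In this sketch the solver never enumerates images: it only sees a vector of state probabilities per step, which is the property the abstract attributes to PBP. How the actual framework treats classifier priors, non-visual observation components, and uncertainty quantification is specified in the paper, not here.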
