Perception-Based Beliefs for POMDPs with Visual Observations

Miriam Schäfers, Merlijn Krale, Thiago D. Simão, Nils Jansen, Maximilian Weininger

TL;DR

Vision-based POMDPs are computationally challenging due to massive observation spaces. The authors introduce PBP, a modular framework that couples a perception model, trained on vision datasets, with belief-based POMDP solvers by a perception-based belief update that uses f(s_v|z_v) while factoring the observation function into vision and non-vision components. They prove that, with a perfect perception model, the PBP update recovers the standard belief update, and they augment it with uncertainty quantification (TUQ and WUQ) to boost robustness against visual corruption. Empirically, PBP instantiated with HSVI or POMCP outperforms end-to-end DRL baselines on VPOMDP benchmarks and remains robust to noisy visuals, highlighting the practical value of integrating perception with principled planning. The work enables leveraging off-the-shelf perception architectures and uncertainty techniques within established POMDP solvers, broadening the applicability of planning under uncertainty to real-world vision tasks.

Abstract

Partially observable Markov decision processes (POMDPs) are a principled planning model for sequential decision-making under uncertainty. Yet, real-world problems with high-dimensional observations, such as camera images, remain intractable for traditional belief- and filtering-based solvers. To tackle this problem, we introduce the Perception-based Beliefs for POMDPs framework (PBP), which complements such solvers with a perception model. This model takes the form of an image classifier which maps visual observations to probability distributions over states. PBP incorporates these distributions directly into belief updates, so the underlying solver does not need to reason explicitly over high-dimensional observation spaces. We show that the belief update of PBP coincides with the standard belief update if the image classifier is exact. Moreover, to handle classifier imprecision, we incorporate uncertainty quantification and introduce two methods to adjust the belief update accordingly. We implement PBP using two traditional POMDP solvers and empirically show that (1) it outperforms existing end-to-end deep RL methods and (2) uncertainty quantification improves robustness of PBP against visual corruption.
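The claim that the perception-based update coincides with the standard belief update under an exact classifier can be checked numerically on a toy problem. The sketch below is not the paper's update equation, which this excerpt does not reproduce. It assumes a purely visual state, a small finite set of "images", and one plausible way to use the classifier output: dividing $f(s' \mid z)$ by the dataset prior over states to recover a likelihood up to a constant (Bayes' rule). All names (`predict`, `standard_update`, `perception_update`, `perfect_perception`) and all numbers are hypothetical.

```python
import numpy as np

def predict(belief, T_a):
    """Prediction step: sum_s Pr(s' | s, a) * b(s), with T_a[s, s']."""
    return belief @ T_a

def standard_update(belief, T_a, O, z):
    """Standard Bayesian belief update, with O[s', z] = Pr(z | s')."""
    unnorm = O[:, z] * predict(belief, T_a)
    return unnorm / unnorm.sum()

def perception_update(belief, T_a, f_z, prior):
    """Update from a classifier output f_z[s'] = f(s' | z).

    Assumption: f_z / prior is proportional to Pr(z | s') by Bayes' rule,
    where `prior` is the state distribution the classifier was trained on.
    """
    unnorm = (f_z / prior) * predict(belief, T_a)
    return unnorm / unnorm.sum()

def perfect_perception(O, prior, z):
    """Exact posterior Pr(s | z) under the given prior (a 'perfect' classifier)."""
    joint = O[:, z] * prior
    return joint / joint.sum()

# Toy model: 3 states, 4 possible images, one fixed action.
T_a = np.array([[0.8, 0.2, 0.0],
                [0.1, 0.7, 0.2],
                [0.0, 0.3, 0.7]])
O = np.array([[0.6, 0.2, 0.1, 0.1],
              [0.1, 0.5, 0.3, 0.1],
              [0.1, 0.1, 0.2, 0.6]])
prior = np.array([0.5, 0.3, 0.2])   # hypothetical dataset prior over states
belief = np.array([0.2, 0.5, 0.3])  # current belief b_t
z = 2                               # index of the observed image

b_std = standard_update(belief, T_a, O, z)
b_pbp = perception_update(belief, T_a, perfect_perception(O, prior, z), prior)

print("standard update:        ", b_std)
print("perception-based update:", b_pbp)
print("coincide:", np.allclose(b_std, b_pbp))
```

The point of the design is that `perception_update` never touches the observation function `O`: it only needs the classifier's distribution over states, so the solver does not have to enumerate or model the image space. In a setting with additional non-visual observations, their likelihood would enter as a further multiplicative factor before normalization.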

Paper Structure

This paper contains 68 sections, 2 theorems, 16 equations, 13 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Consider a vision POMDP satisfying the factored observation function assumption, and let $f$ be a perfect perception model, i.e. $f(\textbf{s}_{\mathrm{v}} {\mid} \textbf{z}_{\mathrm{v}}) = \Pr(\textbf{s}_{\mathrm{v}} {\mid} \textbf{z}_{\mathrm{v}})$. Then, for every belief $b_t \in \Delta(\mathsf{S})$, the next ...

Figures (13)

  • Figure 1: A vision POMDP example: A car must determine the presence of an ambulance using an audio sensor and the traffic light's color by interpreting camera images.
  • Figure 2: An illustration (as a Bayesian network) of the observation function of a VPOMDP under the factored observation function assumption. Notably, $\textbf{z}_{\mathrm{v}}'$ may only depend on the visual variables of $\textbf{s}'$, i.e. $\textbf{s}_{\mathrm{v}}'$, while $\textbf{z}_{\neg\mathrm{v}}$ can depend on all variables of $\textbf{s}'$.
  • Figure 3: Overview of the Perception-based Beliefs for POMDPs Framework (PBP). See the subsection on the overall framework for a detailed explanation.
  • Figure 4: Visualization of FlowerGrid. For each cell, the set of possible observations corresponds to images of a particular class in the 102 Category Flower Dataset (Nilsback and Zisserman, 2008). The magnifying glass shows two normal images, as well as an additive-noise image (top right) and a pure-noise image (bottom left).
  • Figure 5: Average discounted returns (Value) for different algorithms at different probabilities of receiving noisy observations. Additive noise refers to images that are correctly classified with $0.4$ probability, while full noise refers to pure salt-and-pepper images. Shaded areas show $95\%$ confidence in the value of the tested policy.
  • ...and 8 more figures

Theorems & Definitions (5)

  • Example 1
  • Definition 1: Vision POMDP
  • Definition 2: Vision dataset
  • Theorem 1: Soundness of the full factored belief update
  • Theorem 1: Soundness of the full factored belief update