Online Planning in POMDPs with State-Requests
Raphael Avalos, Eugenio Bargiacchi, Ann Nowé, Diederik M. Roijers, Frans A. Oliehoek
TL;DR
The paper addresses planning under partial observability when full state information can be obtained at a cost, introducing the POMDP-SR framework and the online planner AEMS-SR. AEMS-SR searches a rooted cyclic graph rather than a tree, which mitigates the exponential growth of the search space caused by state requests, and it comes with theoretical guarantees of completeness and $\varepsilon$-optimality. The paper formalizes POMDP-SR, analyzes equivalent POMDP transformations, and adapts upper-bound strategies (including $Q$-MDP and FIB-SR) to the state-request setting, with online bound refinement via corner beliefs. Empirical results on RobotDelivery and Tag show that AEMS-SR consistently outperforms POMCP and traditional AEMS, particularly when bounds can be improved online. The work demonstrates practical benefits for real-world domains where state queries are costly yet advantageous for decision quality, and outlines avenues for tailored policy design and broader applicability of AEMS-Loop.
Abstract
In key real-world problems, full state information is sometimes available but only at a high cost, such as activating precise yet energy-intensive sensors or consulting humans, thereby compelling the agent to operate under partial observability. For this scenario, we propose AEMS-SR (Anytime Error Minimization Search with State Requests), a principled online planning algorithm tailored for POMDPs with state requests. By representing the search space as a graph instead of a tree, AEMS-SR avoids the exponential growth of the search space originating from state requests. Theoretical analysis demonstrates AEMS-SR's $\varepsilon$-optimality, ensuring solution quality, while empirical evaluations illustrate its effectiveness compared with AEMS and POMCP, two state-of-the-art online planning algorithms. AEMS-SR enables efficient planning in domains characterized by partial observability and costly state requests, offering practical benefits across various applications.
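To illustrate why a graph can replace a tree here, the following minimal Python sketch contrasts an ordinary Bayesian belief update with a state request. It assumes that a request reveals the state exactly, so the resulting belief is a corner belief (all probability mass on one state), and there are at most $|S|$ such beliefs no matter which belief the request was issued from. The names `belief_update`, `corner_belief`, and `BeliefGraph` are hypothetical and chosen for illustration; this is not the authors' implementation of AEMS-SR.

```python
from typing import Dict, List, Sequence, Tuple

Belief = Tuple[float, ...]


def belief_update(belief: Sequence[float],
                  T_a: List[List[float]],
                  O_a: List[List[float]],
                  obs: int) -> Belief:
    """Standard Bayes filter for one fixed action a.

    T_a[s][s2] = P(s2 | s, a), O_a[s2][o] = P(o | s2, a).
    Returns b'(s2) proportional to O_a[s2][obs] * sum_s T_a[s][s2] * b(s).
    """
    n = len(belief)
    unnorm = [
        O_a[s2][obs] * sum(T_a[s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return tuple(p / z for p in unnorm)


def corner_belief(state: int, n_states: int) -> Belief:
    """Belief after a state request reveals `state` (assumed exact)."""
    return tuple(1.0 if s == state else 0.0 for s in range(n_states))


class BeliefGraph:
    """Hypothetical node store: identical beliefs map to the same node id."""

    def __init__(self) -> None:
        self.nodes: Dict[Belief, int] = {}

    def get_or_create(self, belief: Sequence[float]) -> int:
        # Round so that numerically identical beliefs share one key.
        key = tuple(round(p, 12) for p in belief)
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]
```

In this sketch, calling `get_or_create(corner_belief(0, 2))` from two different parent beliefs returns the same node id, so the subtree below a corner belief is stored once and shared. In a plain search tree, each parent would instead get its own copy of that subtree.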
