Vectorized Online POMDP Planning
Marcus Hoerger, Muhammad Sudrajat, Hanna Kurniawati
TL;DR
The paper introduces VOPP, a fully vectorized online POMDP planner that runs entirely on GPUs by representing the belief tree as tensors and executing forward search and backups as batched, dependency-free operations. Building on PORPP, VOPP analytically solves parts of the optimization and focuses numerical effort on expectation estimation, enabling massive data-parallelism. Empirical results show VOPP achieving at least 20x (and in some cases over 100x) speedups over HyP-DESPOT across large state, action, and observation spaces, while maintaining or improving policy quality. The work demonstrates scalability to complex robotics scenarios, including crowd navigation, and releases the implementation as open source.
Abstract
Planning under partial observability is an essential capability of autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for such planning problems, capturing the stochastic effects of actions and the limited information available through noisy observations. POMDP solving could benefit tremendously from the massive parallelism of today's hardware, but parallelizing POMDP solvers has been challenging. These solvers rely on interleaving numerical optimization over actions with the estimation of their values, which creates dependencies and synchronization bottlenecks between parallel processes that can quickly offset the benefits of parallelization. In this paper, we propose the Vectorized Online POMDP Planner (VOPP), a novel parallel online solver that leverages a recent POMDP formulation that analytically solves part of the optimization component, leaving only the estimation of expectations for numerical computation. VOPP represents all data structures related to planning as a collection of tensors and implements all planning steps as fully vectorized computations over this representation. The result is a massively parallel solver with no dependencies or synchronization bottlenecks between parallel computations. Experimental results indicate that VOPP is at least 20x more efficient in computing near-optimal solutions than an existing state-of-the-art parallel online solver.
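To make the idea of a tensor-based belief tree with batched backups concrete, the following is a minimal NumPy sketch and not VOPP's implementation. It assumes a dense tree with a fixed number of actions and sampled observations per node, and it stands in for the analytic optimization step with a generic log-sum-exp soft maximum under a uniform reference policy. The function names, array layout, and the parameters `gamma` and `eta` are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_backup(q, eta):
    """Closed-form soft maximum over actions for a uniform reference policy.

    Computes (1/eta) * log(mean_a exp(eta * q)) along the last axis,
    shifted by the row maximum for numerical stability.
    """
    m = q.max(axis=-1, keepdims=True)
    return m.squeeze(-1) + np.log(np.mean(np.exp(eta * (q - m)), axis=-1)) / eta


def backup_tree(rewards, obs_weights, leaf_values, gamma=0.95, eta=5.0):
    """Back up values from the leaves to the root, one batched step per level.

    rewards[d]:     shape (N_d, A), immediate reward estimates at depth d
    obs_weights[d]: shape (N_d, A, O), observation weights summing to 1 over O
    leaf_values:    shape (N_D,), value estimates at the deepest level
    Nodes are laid out so that the child of node n under action a and
    observation o sits at index (n * A + a) * O + o in the next level.
    """
    values = leaf_values
    for r, w in zip(reversed(rewards), reversed(obs_weights)):
        n, a, o = w.shape
        v_next = values.reshape(n, a, o)             # children grouped by parent
        q = r + gamma * np.sum(w * v_next, axis=-1)  # expectation over observations
        values = soft_backup(q, eta)                 # analytic step over actions
    return values


# Hypothetical toy tree: 2 actions, 3 observations per action, depth 2.
A, O, D = 2, 3, 2
rng = np.random.default_rng(0)
sizes = [(A * O) ** d for d in range(D)]             # nodes per level: 1, 6
rewards = [rng.normal(size=(n, A)) for n in sizes]
obs_weights = [rng.dirichlet(np.ones(O), size=(n, A)) for n in sizes]
root_value = backup_tree(rewards, obs_weights, np.zeros((A * O) ** D))
print(root_value.shape)  # (1,)
```

Each level is processed with a single reshape, a weighted sum, and a closed-form reduction, so no node waits on a sibling and the only sequential dependency is between tree depths. The same array operations map directly onto GPU tensor libraries, which is the property the abstract attributes to the vectorized design.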
