Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs

Wenjing Tang, Xinyu He, Yongxi Huang, Yunxiao Xiao, Cewu Lu, Panpan Cai

TL;DR

Tru-POMDP is a planner that combines structured belief generation using Large Language Models (LLMs) with principled POMDP planning. It introduces a hierarchical Tree of Hypotheses (TOH) that queries an LLM to construct particle beliefs, and an open-ended POMDP model that enables rigorous Bayesian belief tracking and efficient belief-space planning over these LLM-generated hypotheses.

Abstract

Task planning under uncertainty is essential for home-service robots operating in the real world. Tasks involve ambiguous human instructions, hidden or unknown object locations, and open-vocabulary object types, leading to significant open-ended uncertainty and a boundlessly large planning space. To address these challenges, we propose Tru-POMDP, a planner that combines structured belief generation using Large Language Models (LLMs) with principled POMDP planning. Tru-POMDP introduces a hierarchical Tree of Hypotheses (TOH), which systematically queries an LLM to construct high-quality particle beliefs over possible world states and human goals. We further formulate an open-ended POMDP model that enables rigorous Bayesian belief tracking and efficient belief-space planning over these LLM-generated hypotheses. Experiments on complex object rearrangement tasks across diverse kitchen environments show that Tru-POMDP significantly outperforms state-of-the-art LLM-based and LLM-tree-search hybrid planners, achieving higher success rates with significantly better plans, stronger robustness to ambiguity and occlusion, and greater planning efficiency.
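To make the idea of a hierarchical Tree of Hypotheses concrete, the following is a minimal sketch, not the authors' implementation. It assumes each tree level (target object, target area, initial location) carries a conditional probability supplied by an LLM, and that a particle's weight is the product of the probabilities along its root-to-leaf path. The `Node` class, the `flatten` and `normalize` helpers, and the kitchen objects and numbers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    prob: float  # assumed: conditional probability given the parent, from an LLM
    children: List["Node"] = field(default_factory=list)


def flatten(node: Node, path: Tuple[str, ...] = (), weight: float = 1.0):
    """Turn every root-to-leaf path into one weighted particle."""
    path = path + (node.label,)
    weight = weight * node.prob
    if not node.children:
        return [(path, weight)]
    particles = []
    for child in node.children:
        particles.extend(flatten(child, path, weight))
    return particles


def normalize(particles):
    """Rescale weights so they sum to one."""
    total = sum(w for _, w in particles)
    return [(p, w / total) for p, w in particles]


# Hypothetical tree: target object -> target area -> initial location.
tree = Node("instruction", 1.0, [
    Node("apple", 0.6, [
        Node("table", 1.0, [Node("fridge", 0.7), Node("cabinet", 0.3)]),
    ]),
    Node("banana", 0.4, [
        Node("table", 1.0, [Node("counter", 1.0)]),
    ]),
])

belief = normalize(flatten(tree))
for path, w in belief:
    print(" -> ".join(path), round(w, 2))
```

Each printed line is one hypothesis about the goal and the hidden object location, with its weight. How the actual system prompts the LLM and scores hypotheses is not specified in this section.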

Paper Structure

This paper contains 34 sections, 6 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: The architecture for Tru-POMDP. (a) Task Input: Human instruction and the observed scene graph. (b) Tree of Hypotheses: An LLM infers target objects, target areas, and initial locations, producing weighted particles. (c) Hybrid Belief Update: Bayesian filtering updates the belief using particle prediction and elimination, and augments the filtered belief with LLM particles. (d) Online POMDP Planning: Belief tree search computes the optimal action with the help of dynamic action branching and an LLM-written rollout policy.
  • Figure 2: Performance comparison of Tru-POMDP and baselines. Each bar represents the average value with standard error (SE). In (c), the dashed line indicates the maximum allowed step number.
  • Figure 3: Total tokens (k) used $\downarrow$ by Tru-POMDP and comparison baselines per episode.
  • Figure 4: Results for Ablation Study. Each bar shows average values with standard error (SE).
  • Figure 5: Total planning time $\downarrow$ used by Tru-POMDP and its ablated variants per episode.
  • ...and 2 more figures
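The hybrid belief update described in the Figure 1 caption can be illustrated with a small sketch under stated assumptions. It covers only particle elimination and augmentation, not the prediction step. The rule that triggers augmentation (surviving weight below `min_mass`), the way new weights are mixed in, and the `propose` callback standing in for an LLM query are all assumptions made for illustration, not details taken from the paper.

```python
def update_belief(particles, opened, seen, propose, min_mass=0.5):
    """Eliminate particles that contradict an observation, then augment.

    particles: list of (state, weight); state maps object -> believed location.
    opened:    the location the robot just inspected.
    seen:      set of objects actually observed at that location.
    propose:   hypothetical stand-in for an LLM call returning new (state, weight) pairs.
    """
    survivors = []
    for state, w in particles:
        predicted = {obj for obj, loc in state.items() if loc == opened}
        # Keep the particle only if it agrees with what was observed
        # for the objects it tracks.
        if predicted == (seen & set(state)):
            survivors.append((state, w))

    # Assumed trigger: too little probability mass survived elimination.
    if sum(w for _, w in survivors) < min_mass:
        survivors.extend(propose(opened, seen))

    total = sum(w for _, w in survivors)
    return [(s, w / total) for s, w in survivors] if total > 0 else []


def fake_llm(opened, seen):
    # Hypothetical replacement hypothesis; a real system would query an LLM here.
    return [({"apple": "drawer"}, 0.3)]


prior = [({"apple": "fridge"}, 0.7), ({"apple": "cabinet"}, 0.3)]
# The robot opens the fridge and finds no apple.
posterior = update_belief(prior, opened="fridge", seen=set(), propose=fake_llm)
for state, w in posterior:
    print(state, round(w, 2))
```

In this example the "apple in fridge" particle is eliminated, the remaining mass falls below the assumed threshold, and one new hypothesis is added before renormalization.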