Information Seeking for Robust Decision Making under Partial Observability

Djengo Cyun-Jyun Fang, Tsung-Wei Ke

TL;DR

The paper tackles robust decision making under partial observability and noisy dynamics by linking internal model alignment to explicit information seeking. It introduces InfoSeeker, an LLM-based agent that interleaves targeted information-seeking actions with task planning in a closed loop, aiming to realign its internal dynamics with the environment. A novel text-based benchmark evaluates planning under both observation and dynamics uncertainty, and results show a substantial 74% absolute improvement over prior methods with strong generalization across LLMs and tasks, while preserving sample efficiency. The work also formalizes connections between LLM planning and POMDPs and highlights the practical impact of explicit information seeking for real-world robustness in uncertain environments.

Abstract

Explicit information seeking is essential to human problem-solving in practical environments characterized by incomplete information and noisy dynamics. When the true environmental state is not directly observable, humans seek information to update their internal dynamics and inform future decision-making. Although existing Large Language Model (LLM) planning agents have addressed observational uncertainty, they often overlook discrepancies between their internal dynamics and the actual environment. We introduce Information Seeking Decision Planner (InfoSeeker), an LLM decision-making framework that integrates task-oriented planning with information seeking to align internal dynamics and make optimal decisions under uncertainty in both agent observations and environmental dynamics. InfoSeeker prompts an LLM to actively gather information by planning actions to validate its understanding, detect environmental changes, or test hypotheses before generating or revising task-oriented plans. To evaluate InfoSeeker, we introduce a novel benchmark suite featuring partially observable environments with incomplete observations and uncertain dynamics. Experiments demonstrate that InfoSeeker achieves a 74% absolute performance gain over prior methods without sacrificing sample efficiency. Moreover, InfoSeeker generalizes across LLMs and outperforms baselines on established benchmarks such as robotic manipulation and web navigation. These findings underscore the importance of tightly integrating planning and information seeking for robust behavior in partially observable environments. The project page is available at https://infoseekerllm.github.io

Paper Structure

This paper contains 31 sections, 3 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Overcoming uncertainty through information seeking. (a) When tasked to move a robot gripper to a target location using miscalibrated controllers that introduce a constant (1, 0) offset to every command, existing planners (bottom) fail by over-relying on presumed dynamics without verifying them. In contrast, InfoSeeker (top) actively seeks information, detects discrepancies between commanded and executed motions, and updates its internal dynamics to generate a correct plan. (b) InfoSeeker validates its internal dynamics before planning, while (c) existing approaches depend solely on execution feedback and fixed assumptions. This difference enables InfoSeeker to succeed in environments with uncertain observations and dynamics.
  • Figure 2: System overview of InfoSeeker. Our framework integrates Information Seeking (top-right) and Task-Oriented Planning (bottom-left) in a closed-loop process. The agent formulates and executes strategies to acquire missing knowledge, addressing gaps in its internal dynamics before generating more effective task plans. This iterative approach, supported by the reasoning capabilities of LLMs, enables the agent to reduce uncertainty and enhance planning effectiveness.
  • Figure 3: Ablation study. Success rate ($\%$) versus interaction steps (left) and planning attempts (right) on the perturbed stack single block task. Combining information-seeking and information-extraction behaviors makes InfoSeeker more efficient and effective.
  • Figure 4: Visualization of InfoSeeker and LLM3 (from scratch) in perturbed environments. (a) Robotic arm guidance to target $(1.0, 2.0)$ with a fixed action offset $(-0.1, -0.1)$. (b) Paint mixing for seagreen (yellow + black) using mislabeled tubes: the "red" tube contains white, "blue" contains red, "white" contains black, and "black" contains blue. (c) Navigating to a ball at $(1, 0)$ and delivering it to the goal location $(2, 0)$ under inverted action mappings ("left" moves right, "forward" moves backward). (d) Rearranging blocks to match a target configuration, starting with a hidden inventory (which contains a red block). Across these perturbations, InfoSeeker adapts through information seeking and feedback, while LLM3 repeatedly fails due to persistent misinterpretation.
  • Figure 5: Performance with varying interaction steps (left) and planning attempts (right). The plots show the success rate ($\%$) of InfoSeeker and four LLM baselines [yao2023react, sun2023adaplanner, wang2024llm3] on the stack single block task, as interaction steps (left) and planning attempts (right) are varied. ReAct is included only in the interaction step analysis due to its lack of distinct planning attempts. The plots demonstrate that InfoSeeker effectively leverages increased resources, particularly in perturbed environments.
  • ...and 1 more figures
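
The miscalibrated-controller scenario in Figure 1(a) can be illustrated with a toy example. The sketch below is not the paper's implementation and involves no LLM: the function names (`make_env`, `estimate_offset`, `plan_command`, `run`), the zero-command probe, and the starting position are all assumptions chosen for illustration. It only shows why a single information-seeking action can be enough to recover a constant command offset before planning.

```python
from typing import Callable, Tuple

Vec = Tuple[float, float]


def make_env(offset: Vec) -> Callable[[Vec, Vec], Vec]:
    """Hypothetical environment: every command is shifted by a hidden constant offset."""
    def step(pos: Vec, cmd: Vec) -> Vec:
        return (pos[0] + cmd[0] + offset[0], pos[1] + cmd[1] + offset[1])
    return step


def estimate_offset(step: Callable[[Vec, Vec], Vec], pos: Vec,
                    probe: Vec = (0.0, 0.0)) -> Tuple[Vec, Vec]:
    """Information-seeking action: issue a probe command and compare the
    commanded motion with the executed motion."""
    new_pos = step(pos, probe)
    est = (new_pos[0] - pos[0] - probe[0], new_pos[1] - pos[1] - probe[1])
    return est, new_pos


def plan_command(pos: Vec, target: Vec, believed_offset: Vec) -> Vec:
    """Task-oriented plan: one command that reaches the target under the
    agent's current belief about the offset."""
    return (target[0] - pos[0] - believed_offset[0],
            target[1] - pos[1] - believed_offset[1])


def run(target: Vec, offset: Vec, seek_info: bool) -> Vec:
    """Return the final gripper position, with or without a probe step."""
    step = make_env(offset)
    pos: Vec = (0.0, 0.0)
    believed: Vec = (0.0, 0.0)  # presumed dynamics: no offset
    if seek_info:
        believed, pos = estimate_offset(step, pos)
    return step(pos, plan_command(pos, target, believed))


if __name__ == "__main__":
    target, offset = (3.0, 2.0), (1.0, 0.0)  # offset value taken from Figure 1(a)
    print("with probe:   ", run(target, offset, seek_info=True))
    print("without probe:", run(target, offset, seek_info=False))
```

With the probe, the agent's believed offset matches the hidden one and the planned command lands on the target; without it, the plan overshoots by exactly the offset. A noisy or state-dependent perturbation would need more than one probe, which this sketch does not model.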