Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents
Changdae Oh, Seongheon Park, To Eun Kim, Jiatong Li, Wendi Li, Samuel Yeh, Xuefeng Du, Hamed Hassani, Paul Bogdan, Dawn Song, Sharon Li
TL;DR
The paper argues that uncertainty quantification for LLMs must move beyond static, single-turn QA to agentic, interactive, long-horizon settings where actions, observations, and environment states unfold over time. It introduces a general mathematical formulation that models the agent's trajectory as a stochastic process and shows that existing UQ approaches are special cases of it, which reveals a key limitation: traditional UQ treats uncertainty as monotonically accumulating. To address this, it proposes a conditional uncertainty reduction process with an information-gating mechanism that distinguishes interactive, evidential actions from non-interactive ones, so that uncertainty can be reduced along a trajectory, and it provides analytic bounds. The work discusses practical implications for frontier LLMs, healthcare, software engineering, and robotics, and outlines open problems (benchmarks, long-horizon estimation, and multi-agent dynamics), laying a foundation for safer, uncertainty-aware agentic systems. Overall, the framework offers principled guidance for designing uncertainty-aware LLM agents that interact with users and tools while actively managing risk.
Abstract
Uncertainty quantification (UQ) for large language models (LLMs) is a key building block for the safety guardrails of everyday LLM applications. Yet, even as LLM agents are increasingly deployed in highly complex tasks, most UQ research still centers on single-turn question answering. We argue that UQ research must shift to realistic settings with interactive agents, and that a new principled framework for agent UQ is needed. This paper presents the first general formulation of agent UQ that subsumes broad classes of existing UQ setups. Under this formulation, we show that prior works implicitly treat LLM UQ as an uncertainty accumulation process, a viewpoint that breaks down for interactive agents in an open world. In contrast, we propose a novel perspective, a conditional uncertainty reduction process, that explicitly models reducible uncertainty over an agent's trajectory by highlighting the "interactivity" of actions. From this perspective, we outline a conceptual framework that provides actionable guidance for designing UQ in LLM agent setups. Finally, we conclude with the practical implications of agent UQ for frontier LLM development and domain-specific applications, as well as remaining open problems.
