BayesAgent: Bayesian Agentic Reasoning Under Uncertainty via Verbalized Probabilistic Graphical Modeling

Hengguan Huang, Xing Shen, Songtao Wang, Lingfa Meng, Dianbo Liu, David Alejandro Duchene, Hao Wang, Samir Bhatt

TL;DR

This work addresses the challenge of capturing latent structure and uncertainty in LLM-based agent reasoning. It introduces Verbalized Probabilistic Graphical Modeling (vPGM), a Bayesian framework that guides LLMs to discover latent variables and dependencies through prompts and to perform prompting-based Bayesian inference, with predictions computed as $E_{P(Z|X)}[P(Y|Z)]$. The Bayesian-Enhanced variant, BayesVPGM, adds a Dirichlet posterior over predictions and a differentiable calibration loss to optimize a balancing parameter $\lambda$, improving confidence calibration. Across ScienceQA, ChatCoach, and A-OKVQA, the approach yields higher accuracy and better calibration than strong baselines, demonstrating scalable, uncertainty-aware latent-variable reasoning in multi-source, open-ended tasks.
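
As a rough illustration of the prediction rule $E_{P(Z|X)}[P(Y|Z)]$, the sketch below marginalizes over a single discrete latent variable. It is not the authors' implementation: the function name `expected_prediction`, the latent states, and all probabilities are hypothetical, and it assumes the LLM has already verbalized $P(Z|X)$ and $P(Y|Z)$ as numbers.

```python
from typing import Dict


def expected_prediction(
    posterior_z: Dict[str, float],
    likelihood_y_given_z: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Return P(Y|X) = sum_z P(z|X) * P(Y|z) for discrete Z and Y."""
    total = sum(posterior_z.values())
    if total <= 0:
        raise ValueError("posterior over Z must have positive mass")
    prediction: Dict[str, float] = {}
    for z, p_z in posterior_z.items():
        # Renormalize, since verbalized probabilities may not sum exactly to 1.
        weight = p_z / total
        for y, p_y in likelihood_y_given_z[z].items():
            prediction[y] = prediction.get(y, 0.0) + weight * p_y
    return prediction


# Hypothetical numbers: Z says whether the retrieved context is relevant.
posterior_z = {"context_relevant": 0.7, "context_irrelevant": 0.3}
likelihood = {
    "context_relevant": {"A": 0.9, "B": 0.1},
    "context_irrelevant": {"A": 0.4, "B": 0.6},
}
prediction = expected_prediction(posterior_z, likelihood)
print(prediction)
```

With these made-up values, answer A receives 0.7 × 0.9 + 0.3 × 0.4 = 0.75. The confidence drops below the 0.9 that the "relevant context" branch alone would give, because some posterior mass sits on the branch where the context is irrelevant.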

Abstract

Human cognition excels at transcending sensory input and forming latent representations that structure our understanding of the world. While Large Language Model (LLM) agents demonstrate emergent reasoning and decision-making abilities, they lack a principled framework for capturing latent structures and modeling uncertainty. In this work, we explore for the first time how to bridge LLM agents with probabilistic graphical models (PGMs) to address agentic reasoning under uncertainty. To this end, we introduce Verbalized Probabilistic Graphical Modeling (vPGM), a Bayesian agentic framework that (i) guides LLM agents in following key principles of PGMs through natural language and (ii) refines the resulting posterior distributions via numerical Bayesian inference. Unlike many traditional probabilistic methods requiring substantial domain expertise, vPGM bypasses expert-driven model design, making it well-suited for scenarios with limited assumptions. We evaluated our model on several agentic reasoning tasks, both close-ended and open-ended. Our results indicate that the model effectively enhances confidence calibration and text generation quality.

Paper Structure (29 sections, 1 theorem, 7 equations, 3 figures, 4 tables)


Key Result

Theorem 1

Let $\{(\mathbf{u}_i,y_i)\}_{i=1}^{n}$ be the training set with features $\mathbf{u}_i\in\mathbb{R}^{d}$ and one-hot labels $y_{ik}$. For any parameter vector $\theta$, let $g_{\theta}:\mathbb{R}^{d}\to\Delta^{K-1}$ be a function that produces class probabilities $\widehat{p}_{ik}(\theta)$ [...] where $\beta>0$, $\bar{\widehat{p}}_{k}(\theta)=\tfrac{1}{n}\sum_{i}\widehat{p}_{ik}(\theta)$, and [...]

Figures (3)

  • Figure 1: Example of inference using BayesVPGM. The Chameleon framework erroneously assigns high confidence to the answer even though its LLM agents capture irrelevant information. In contrast, BayesVPGM identifies this discrepancy and assigns low confidence. Here, we show a simplified inference prompt. See the Appendix for detailed examples.
  • Figure 2: Overview of the vPGM learning framework. CPDs denote conditional probability distributions. We omit the observed variable $\mathbf{X}$ for clarity.
  • Figure 3: Reliability diagrams of (a) Chameleon+ and (b) BayesVPGM ($N=3,M=3$) on ScienceQA (see the Appendix for diagrams of Chameleon and vPGM). BayesVPGM achieves a much lower ECE than Chameleon+ and approaches the ideal confidence calibration curve (the diagonal dashed line).
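
The ECE reported alongside the reliability diagrams is typically computed by binning predictions by confidence and averaging the gap between accuracy and confidence in each bin. The sketch below uses the common equal-width binning; the paper's exact bin count and binning scheme are not stated here, so `n_bins=10` and the sample data are assumptions.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("at least one prediction is required")

    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical predictions: 90% confidence but only 3 of 4 correct.
overconfident = expected_calibration_error([0.9] * 4, [True, True, True, False])
# Hypothetical predictions: 50% confidence and 1 of 2 correct.
calibrated = expected_calibration_error([0.5, 0.5], [True, False])
print(overconfident, calibrated)
```

A perfectly calibrated model lies on the diagonal of the reliability diagram and has an ECE of zero. The overconfident example above lies below the diagonal, with a gap of roughly 0.15.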

Theorems & Definitions (1)

  • Theorem 1: Global Optimum Implies Perfect ECE