BayesAgent: Bayesian Agentic Reasoning Under Uncertainty via Verbalized Probabilistic Graphical Modeling

Hengguan Huang; Xing Shen; Songtao Wang; Lingfa Meng; Dianbo Liu; David Alejandro Duchene; Hao Wang; Samir Bhatt

TL;DR

This work addresses the challenge of capturing latent structure and uncertainty in LLM-based agent reasoning. It introduces Verbalized Probabilistic Graphical Modeling (vPGM), a Bayesian framework that guides LLMs to discover latent variables and dependencies through prompts and to perform prompting-based Bayesian inference, with predictions computed as $E_{P(Z|X)}[P(Y|Z)]$. The Bayesian-Enhanced variant, BayesVPGM, adds a Dirichlet posterior over predictions and a differentiable calibration loss to optimize a balancing parameter $\lambda$, improving confidence calibration. Across ScienceQA, ChatCoach, and A-OKVQA, the approach yields higher accuracy and better calibration than strong baselines, demonstrating scalable, uncertainty-aware latent-variable reasoning in multi-source, open-ended tasks.
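
The prediction rule $E_{P(Z|X)}[P(Y|Z)]$ can be made concrete with a small worked example. The sketch below assumes a single discrete latent variable $Z$ with two states and a binary answer $Y$; the function name and all probability values are hypothetical and chosen only for illustration, not taken from the paper.

```python
def predictive_distribution(p_z_given_x, p_y_given_z):
    """Return P(Y|X) = sum_z P(z|X) * P(Y|z) for a discrete latent Z.

    p_z_given_x: list of posterior weights P(z|X), one per latent state.
    p_y_given_z: list of rows, where row z holds P(Y=k|z) for each class k.
    """
    num_classes = len(p_y_given_z[0])
    p_y = [0.0] * num_classes
    for weight, row in zip(p_z_given_x, p_y_given_z):
        for k, prob in enumerate(row):
            p_y[k] += weight * prob
    return p_y


# Hypothetical numbers: two latent states, two answer classes.
p_z_given_x = [0.7, 0.3]          # assumed posterior over Z given X
p_y_given_z = [[0.9, 0.1],        # assumed P(Y|Z=0)
               [0.2, 0.8]]        # assumed P(Y|Z=1)

p_y = predictive_distribution(p_z_given_x, p_y_given_z)
print(p_y)
```

With these assumed inputs the first class receives 0.7 * 0.9 + 0.3 * 0.2 = 0.69, so the prediction stays less confident than the 0.9 that the more likely latent state alone would suggest.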

Abstract

Human cognition excels at transcending sensory input and forming latent representations that structure our understanding of the world. While Large Language Model (LLM) agents demonstrate emergent reasoning and decision-making abilities, they lack a principled framework for capturing latent structures and modeling uncertainty. In this work, we explore for the first time how to bridge LLM agents with probabilistic graphical models (PGMs) to address agentic reasoning under uncertainty. To this end, we introduce Verbalized Probabilistic Graphical Modeling (vPGM), a Bayesian agentic framework that (i) guides LLM agents in following key principles of PGMs through natural language and (ii) refines the resulting posterior distributions via numerical Bayesian inference. Unlike many traditional probabilistic methods requiring substantial domain expertise, vPGM bypasses expert-driven model design, making it well-suited for scenarios with limited assumptions. We evaluated our model on several agentic reasoning tasks, both close-ended and open-ended. Our results indicate that the model effectively enhances confidence calibration and text generation quality.
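
As a rough illustration of what numerical refinement of a categorical prediction can look like, the sketch below applies the standard Dirichlet-categorical conjugate update. It is not the paper's procedure: the way the balancing parameter enters (here `lam` scales the pseudo-counts contributed by a prior distribution), the function name, and all numbers are assumptions made for this example.

```python
from collections import Counter


def dirichlet_posterior_mean(prior_probs, sampled_labels, lam):
    """Posterior mean of class probabilities under a Dirichlet prior.

    Assumption: the prior is Dirichlet(lam * prior_probs), so lam > 0 controls
    how strongly the prior is weighted against the observed label counts.
    prior_probs must be strictly positive for the prior to be proper.
    """
    num_classes = len(prior_probs)
    counts = Counter(sampled_labels)
    # Conjugate update: posterior concentration = prior pseudo-counts + counts.
    alpha_post = [lam * prior_probs[k] + counts.get(k, 0)
                  for k in range(num_classes)]
    total = sum(alpha_post)
    return [a / total for a in alpha_post]


# Hypothetical inputs: a three-class prior and three sampled predictions.
prior_probs = [0.6, 0.3, 0.1]
sampled_labels = [0, 0, 1]
posterior = dirichlet_posterior_mean(prior_probs, sampled_labels, lam=2.0)
print(posterior)
```

In this toy setting a larger `lam` pulls the result toward the prior, while a smaller `lam` lets the sampled labels dominate.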

Paper Structure (29 sections, 1 theorem, 7 equations, 3 figures, 4 tables)

Introduction
Related Work
LLM Prompting
LLM Agents and Agentic Systems
Concurrent Work
Our Method: Verbalized Probabilistic Graphical Modeling (vPGM)
Overview of vPGM
Graphical Structure Discovery
Prompting-Based Bayesian Inference
Prediction Under Uncertainty
Bayesian-Enhanced vPGM: BayesVPGM
Posterior Inference Under a Dirichlet Prior
Optimizing $\lambda$ via a Differentiable Calibration Loss
Experiments
Science Question Answering
...and 14 more sections

Key Result

Theorem 1

Let $\{(\mathbf{u}_i,y_i)\}_{i=1}^{n}$ be the training set with features $\mathbf{u}_i\in\mathbb{R}^{d}$ and one-hot labels $y_{ik}$. For any parameter vector $\theta$, let $g_{\theta}:\mathbb{R}^{d}\to\Delta^{K-1}$ be a function that produces class probabilities $\widehat{p}_{ik}(\theta)=g_{\theta}(\mathbf{u}_i)_k$ ... where $\beta>0$, $\bar{\widehat{p}}_{k}(\theta)=\tfrac{1}{n}\sum_{i}\widehat{p}_{ik}(\theta)$, and ...

Figures (3)

Figure 1: Example of inference using BayesVPGM. The Chameleon framework erroneously assigns high confidence to its answer even though its LLM agents capture irrelevant information. In contrast, BayesVPGM identifies this discrepancy and assigns low confidence. Here, we show a simplified inference prompt. See the appendix for detailed examples.
Figure 2: Overview of the vPGM learning framework. CPDs denote conditional probability distributions. We omit the observed variable $\mathbf{X}$ for clarity.
Figure 3: Reliability diagrams of (a) Chameleon+ and (b) BayesVPGM ($N=3,M=3$) on ScienceQA (see the appendix for diagrams of Chameleon and vPGM). BayesVPGM achieves a much lower ECE than Chameleon+ and approaches the ideal confidence calibration curve (the diagonal dashed line).
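
For readers unfamiliar with the ECE metric behind the reliability diagrams in Figure 3, here is a minimal sketch of the common equal-width-bin estimator. The number of bins, the binning convention (left-closed bins, with confidence 1.0 placed in the last bin), and the toy data are assumptions for illustration and may differ from the paper's evaluation setup.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted confidence in [0, 1] for each example.
    correct: 1 if the prediction was right, 0 otherwise.
    """
    n = len(confidences)
    conf_sum = [0.0] * num_bins
    hit_sum = [0.0] * num_bins
    count = [0] * num_bins
    for conf, hit in zip(confidences, correct):
        # Clamp so that conf == 1.0 falls into the last bin.
        b = min(int(conf * num_bins), num_bins - 1)
        conf_sum[b] += conf
        hit_sum[b] += hit
        count[b] += 1
    ece = 0.0
    for b in range(num_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece


# Toy data (hypothetical): two overconfident and two underconfident predictions.
confidences = [0.9, 0.9, 0.6, 0.6]
correct = [1, 0, 1, 1]
ece = expected_calibration_error(confidences, correct)
print(ece)
```

A perfectly calibrated model has matching accuracy and average confidence in every bin, which corresponds to the diagonal in a reliability diagram and an ECE of zero.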

Theorems & Definitions (1)

Theorem 1: Global Optimum Implies Perfect ECE
