Robust Search with Uncertainty-Aware Value Models for Language Model Reasoning

Fei Yu; Yingru Li; Benyou Wang

TL;DR

The paper tackles verifier failure in value-model-guided search for language-model reasoning by introducing uncertainty-aware value modeling. It presents Uncertainty-Aware Value Models (UVMs), which output posterior value distributions via the Ensemble++ framework, and a Group Thompson Sampling strategy that selects candidates based on their probability of being optimal. Empirical results across in-distribution (ID) and out-of-distribution (OOD) benchmarks show improved solution coverage, particularly under distribution shift, with a noted trade-off in precision under majority voting. The work demonstrates robust, uncertainty-aware search with modest overhead and provides code to facilitate adoption and further research in uncertainty quantification for LLM search.

Abstract

Value model (VM) guided search is effective in steering large language model (LLM) generation but lacks robustness. The cause is verifier failure: imperfect VMs mistakenly prune valid reasoning paths, especially when they encounter unseen reasoning paths generated during search. To address this, we propose an uncertainty-aware framework with two key components: (1) Uncertainty-Aware Value Models (UVMs), which replace single-point value estimates with value distributions to quantify prediction reliability, and (2) Group Thompson Sampling, an efficient algorithm that selects candidates based on their probability of being optimal. Experiments on two In-Distribution (ID) settings (GSM8K, MATH) and three Out-Of-Distribution (OOD) settings (e.g., AIME25, Minerva Math) show that our method significantly mitigates verifier failure and boosts solution coverage, especially on OOD problems. This work provides the first systematic integration of uncertainty quantification into LLM search paradigms, enhancing robustness. The code is released at https://github.com/FreedomIntelligence/UVM.
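
To make the selection idea concrete, the following is a minimal sketch of choosing a group of candidates by repeated posterior sampling. It is not the released implementation: the function name `group_thompson_select`, the Gaussian posterior summarized by `means` and `stds`, and the draw-then-remove loop are all assumptions made for illustration. The paper obtains its value distributions from Ensemble++ rather than from a Gaussian.

```python
import random


def group_thompson_select(means, stds, k, rng=None):
    """Pick up to k distinct candidates by repeated posterior sampling.

    Hypothetical illustration: each candidate's value posterior is assumed
    to be Gaussian with the given mean and standard deviation.
    """
    rng = rng or random.Random()
    remaining = list(range(len(means)))
    chosen = []
    while remaining and len(chosen) < k:
        # Draw one plausible value per remaining candidate from its posterior.
        draws = {i: rng.gauss(means[i], stds[i]) for i in remaining}
        # The candidate with the highest draw wins this round, so a candidate
        # is picked with the probability that it is the best under the posterior.
        best = max(draws, key=draws.get)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Usage: candidate 1 has a lower mean than candidate 0 but high uncertainty,
# so it still has a real chance of being kept instead of being pruned.
picks = group_thompson_select([0.6, 0.5, 0.2], [0.0, 1.0, 0.0], k=1,
                              rng=random.Random(0))
```

With all standard deviations set to zero the sketch reduces to ordinary top-k selection by point estimate, which is the setting in which a single wrong value estimate can prune a valid reasoning path.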
