Reasoning Models Better Express Their Confidence

Dongkeun Yoon; Seungone Kim; Sohee Yang; Sunkyoung Kim; Soyeon Kim; Yongil Kim; Eunbi Choi; Yireun Kim; Minjoon Seo

TL;DR

The paper shows that reasoning models that use extended chain-of-thought (CoT) not only solve problems more effectively but also express their confidence more accurately than their non-reasoning counterparts. Benchmarking six reasoning models across six datasets, it finds better calibration in 33 of 36 settings and attributes the gains to slow thinking behaviors, such as exploring alternative approaches and backtracking, that let a model adjust its confidence dynamically as its CoT unfolds. Ablations show that removing these behaviors from the CoT degrades calibration, while non-reasoning models become better calibrated when guided to slow think via in-context learning. These findings point to slow thinking as a key mechanism for trustworthy, uncertainty-aware LLMs and suggest that prompting for slow thinking can improve calibration.

Abstract

Despite their strengths, large language models (LLMs) often fail to communicate their confidence accurately, making it difficult to assess when they might be wrong and limiting their reliability. In this work, we demonstrate that reasoning models that engage in extended chain-of-thought (CoT) reasoning exhibit superior performance not only in problem-solving but also in accurately expressing their confidence. Specifically, we benchmark six reasoning models across six datasets and find that they achieve strictly better confidence calibration than their non-reasoning counterparts in 33 out of the 36 settings. Our detailed analysis reveals that these gains in calibration stem from the slow thinking behaviors of reasoning models (e.g., exploring alternative approaches and backtracking) which enable them to adjust their confidence dynamically throughout their CoT, making it progressively more accurate. In particular, we find that reasoning models become increasingly better calibrated as their CoT unfolds, a trend not observed in non-reasoning models. Moreover, removing slow thinking behaviors from the CoT leads to a significant drop in calibration. Lastly, we show that non-reasoning models also demonstrate enhanced calibration when simply guided to slow think via in-context learning, fully isolating slow thinking as the source of the calibration gains.
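To make "confidence calibration" concrete, the sketch below computes expected calibration error (ECE), one common way to measure how far stated confidence drifts from observed accuracy. This is an illustration only: the abstract does not say which calibration metrics the paper reports, so ECE, the function name `expected_calibration_error`, and the choice of ten equal-width bins are assumptions made here, not the authors' method.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Expected calibration error with equal-width confidence bins.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: [sum of confidences, number correct, count].
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += conf
        bins[idx][1] += int(ok)
        bins[idx][2] += 1

    total = len(confidences)
    ece = 0.0
    for conf_sum, n_correct, count in bins:
        if count == 0:
            continue
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (count / total) * abs(accuracy - avg_conf)
    return ece
```

Under this measure, a model that states 90% confidence and is right on nine of ten such answers scores near zero, whereas a model that states 100% confidence and is right only half the time scores 0.5. Lower values indicate better calibration.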
