ADVICE: Answer-Dependent Verbalized Confidence Estimation
Ki Jung Seo, Sehun Lim, Taeuk Kim
TL;DR
The paper addresses overconfidence in the verbalized confidence of large language models and identifies answer-independence, the model's failure to condition its confidence on its own answer, as a key factor. It introduces ADVICE, a fine-tuning framework that explicitly grounds confidence estimation in the model's answer through a contrastive, answer-aware objective combining $L_{LM}$, $L_{JSD}$, and $L_{Margin}$. Across open-weight LLMs and multiple QA benchmarks, ADVICE improves calibration (lower $ECE$ and $NCE$) while preserving task performance, generalizes to out-of-distribution data, and produces more balanced confidence distributions. The work provides mechanistic insights into confidence verbalization and offers a practical approach to safer, more trustworthy verbalized confidence in LLMs.
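For readers unfamiliar with the calibration metric mentioned above, the following is a minimal sketch of the standard binned expected calibration error (ECE). It is not taken from the paper: the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions made for illustration, and the paper's exact evaluation settings may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the weighted average of |accuracy - mean confidence| per bin.

    confidences: floats in [0, 1], e.g. a verbalized "90%" becomes 0.9.
    correct: 0/1 flags marking whether each answer was right.
    n_bins: number of equal-width bins (assumed value, not from the paper).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes to the last bin
        bins[index].append((conf, is_correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        # Weight each bin's gap by the fraction of samples it holds.
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

For example, a model that states 90% confidence on four answers but gets only three right has a gap of about 0.15 between confidence and accuracy, which is the kind of overconfidence the metric is meant to expose.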
Abstract
Recent progress in large language models (LLMs) has enabled them to express their confidence in natural language, enhancing transparency and reliability. However, this verbalized confidence is often overconfident, and the cause remains poorly understood. In this work, we conduct a detailed analysis of the dynamics underlying verbalized confidence and identify answer-independence, defined as the model's failure to condition confidence on its own answer, as a key factor. To address this, we propose ADVICE (Answer-Dependent Verbalized Confidence Estimation), a fine-tuning framework that facilitates answer-grounded confidence estimation. Extensive experiments show that ADVICE substantially improves confidence calibration while preserving task performance. Further analyses confirm that ADVICE strengthens answer-groundedness, leading to more balanced and well-calibrated confidence distributions. Our findings shed light on the origin of overconfidence and establish a framework for more trustworthy confidence verbalization.
