Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience
Haixia Han, Tingyun Li, Shisong Chen, Jie Shi, Chengyu Du, Yanghua Xiao, Jiaqing Liang, Xin Lin
TL;DR
The paper tackles the problem of unreliable confidence in LLM outputs by introducing LePe, a three-stage framework (testing, learning, predicting) inspired by cognitive diagnostics that enables explicit confidence expression. It builds a training-data pipeline that uses mutation-based question variants and answer sampling to construct confidence-labeled instruction data for fine-tuning, with the aim of aligning predicted confidence with actual correctness. Experiments on CuteGPT-13B and LLaMA2-Chat-13B across four datasets show that LePe improves calibration (lower ECE) and yields strong correlations between confidence and correctness (e.g., $r$ up to $0.98$ on some tasks), while also generalizing to out-of-domain data. The approach offers a practical path to better human–LLM collaboration by exposing calibrated uncertainty and by helping to identify and address model weaknesses. Calibration is evaluated against the formal condition $P(\hat{y}=y \mid \mathrm{conf}=z) = z$ for all $z \in [0,1]$: among answers given with confidence $z$, a fraction $z$ should be correct.
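To make the calibration condition and the ECE metric concrete, the following is a minimal sketch of how expected calibration error is commonly computed. It is not taken from the paper: the function name `expected_calibration_error`, the choice of 10 equal-width bins, and the convention of placing a confidence of exactly 1.0 in the last bin are all assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumption: equal-width bins over [0, 1]
) -> float:
    """Weighted average gap between accuracy and mean confidence per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # sum of confidences falling in each bin
    hit_sum = [0] * n_bins      # number of correct answers in each bin
    count = [0] * n_bins        # number of predictions in each bin

    for c, ok in zip(confidences, correct):
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # Confidence 1.0 would index bin n_bins, so clamp it into the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += int(bool(ok))
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Four answers stated at 0.9 confidence (three correct), one at 0.1 (wrong).
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9, 0.1],
                                     [True, True, True, False, False]))
```

A perfectly calibrated model satisfies the condition above in every bin and therefore has an ECE of zero; larger values indicate that stated confidence and observed accuracy diverge.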
Abstract
Large Language Models (LLMs) have exhibited remarkable performance across various downstream tasks, but they may generate inaccurate or false information in a confident tone. One possible solution is to equip the LLM with a confidence expression capability, such that the expressed confidence is well aligned with the true probability that the generated answer is correct. However, neither the intrinsic ability of LLMs nor the signals from the output logits of answers accurately captures the uncertainty of a response. Therefore, drawing inspiration from cognitive diagnostics, we propose Learning from Past experience (LePe), a method to enhance the capability for confidence expression. Specifically, we first identify three key problems: (1) How to capture the inherent confidence of the LLM? (2) How to teach the LLM to express confidence? (3) How to evaluate the confidence expression of the LLM? We then devise three stages in LePe to address these problems. In addition, to accurately capture the confidence of an LLM when constructing the training data, we design a complete pipeline that includes question preparation and answer sampling. We also conduct experiments using the Llama family of LLMs to verify the effectiveness of our proposed method on four datasets.
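The abstract does not spell out how answer sampling yields a confidence label. One common reading, shown here purely as an assumption and not as the authors' pipeline, is to sample several answers to the same question and use the fraction that match the reference answer as the confidence target. The function name `sampled_confidence` and the case-insensitive exact-match rule are hypothetical.

```python
from typing import Sequence


def sampled_confidence(sampled_answers: Sequence[str], reference: str) -> float:
    """Fraction of sampled answers that match the reference answer.

    Assumption: matching is exact after trimming whitespace and lowercasing.
    A real pipeline would likely need a task-specific correctness check.
    """
    if not sampled_answers:
        raise ValueError("at least one sampled answer is required")

    def normalize(text: str) -> str:
        return text.strip().lower()

    target = normalize(reference)
    hits = sum(1 for answer in sampled_answers if normalize(answer) == target)
    return hits / len(sampled_answers)


if __name__ == "__main__":
    # Hypothetical samples for one question whose reference answer is "Paris".
    samples = ["Paris", "paris ", "Lyon", "Paris"]
    print(sampled_confidence(samples, "Paris"))
```

Under this reading, a question the model answers correctly in three of four samples would receive a confidence label of 0.75 in the instruction data.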
