Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning

Ranganath Krishnan, Piyush Khanna, Omesh Tickoo

TL;DR

This work tackles the problem of unreliable uncertainty estimates in large language models, which can lead to credible-sounding but incorrect outputs. It introduces an uncertainty-aware causal language modeling (UA-CLM) loss grounded in decision theory, designed to maximize accuracy while aligning token-level uncertainty with correctness via the entropy term $H_i$ and a $\tanh$-scaled calibration. Empirically, UA-CLM improves uncertainty calibration on both question-answering (QA) and visual question-answering (VQA) tasks, boosting hallucination detection, selective generation, and out-of-domain detection without sacrificing text quality. The method is aimed at generating trustworthy open-ended responses and supporting uncertainty-aware decision-making in real-world AI systems; open directions for future work include black-box settings and sentence-level calibration.
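As a rough illustration of the two ingredients named above, the sketch below computes a token-level predictive entropy $H_i$ from next-token logits and squashes it with $\tanh$. The function names `token_entropy` and `tanh_scaled` are hypothetical, and how the paper combines these quantities into the UA-CLM loss is not stated in this summary, so this is not the authors' implementation.

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) at one token position."""
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Skip zero-probability entries to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def tanh_scaled(entropy):
    """Map a non-negative entropy into [0, 1) (assumed form of the scaling)."""
    return math.tanh(entropy)

# A peaked distribution yields low entropy, a flat one yields high entropy.
confident = token_entropy([8.0, 0.0, 0.0, 0.0])
uncertain = token_entropy([0.0, 0.0, 0.0, 0.0])
print(tanh_scaled(confident), tanh_scaled(uncertain))
```

Bounding the entropy with $\tanh$ puts it on a scale comparable to a probability, which is presumably what lets it be weighed against token correctness.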

Abstract

Large language models (LLMs) have revolutionized the field of natural language processing with their impressive reasoning and question-answering capabilities. However, these models are sometimes prone to generating credible-sounding but incorrect information, a phenomenon known as LLM hallucinations. Reliable uncertainty estimation in LLMs is essential for fostering trust in their generated responses and serves as a critical tool for the detection and prevention of erroneous or hallucinated outputs. To achieve reliable and well-calibrated uncertainty quantification in open-ended and free-form natural language generation, we propose an uncertainty-aware fine-tuning approach for LLMs. This approach enhances the model's ability to provide reliable uncertainty estimates without compromising accuracy, thereby guiding the model to produce more trustworthy responses. We introduce a novel uncertainty-aware causal language modeling loss function, grounded in the principles of decision theory. Through rigorous evaluation on multiple free-form question-answering datasets and models, we demonstrate that our uncertainty-aware fine-tuning approach yields better-calibrated uncertainty estimates in natural language generation tasks than fine-tuning with the standard causal language modeling loss. Furthermore, the experimental results show that the proposed method significantly improves the model's ability to detect hallucinations and identify out-of-domain prompts.

Paper Structure

This paper contains 32 sections, 4 equations, 14 figures, 9 tables, 1 algorithm.

Figures (14)

  • Figure 1: The proposed Uncertainty-aware Causal Language Modeling (UA-CLM) outperforms standard Causal Language Modeling (CLM) in all four UQ metrics across various models. The performance is evaluated using AUROC for hallucination detection and AUARC for selective generation based on four distinct UQ metrics.
  • Figure 2: Uncertainty calibration analysis: Spearman's rank correlation coefficient and Pearson correlation coefficient between uncertainty estimates and generated text quality (ROUGE-L) scores for free-form open-ended question answering. Stronger negative correlation is desired for well-calibrated uncertainty quantification.
  • Figure 3: Accuracy versus Expected Calibration Error (ECE) comparison between UA-CLM, CLM, and pre-trained baseline across different LLM architectures on CoQA and TriviaQA datasets. The ideal model should have high accuracy and low ECE, indicating accurate predictions with well-calibrated uncertainty quantification (upper-left of the plot). The ECE of models fine-tuned with UA-CLM shows significant improvement compared to the pre-trained baseline and CLM fine-tuning.
  • Figure 4: Sample from CoQA (Reddy et al., 2019) illustrating the co-reference chain of conversational questions.
  • Figure 5: Data samples from OK-VQA (Marino et al., 2019) across different knowledge categories.
  • ...and 9 more figures
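
The Expected Calibration Error (ECE) referenced in the Figure 3 caption can be sketched as follows. This is a minimal version of the standard equal-width binned estimator; the number of bins, the bin-edge convention, and the function name `expected_calibration_error` are assumptions, since the paper's exact evaluation settings are not given in this summary.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| over bins.

    confidences: floats in [0, 1]; correct: booleans of the same length.
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    # Equal-width bins [k/n_bins, (k+1)/n_bins); confidence 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece

# Ten answers at 0.9 confidence, nine of them correct: confidence matches accuracy.
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
# Fully confident but always wrong: the worst case.
overconfident = expected_calibration_error([1.0] * 4, [False] * 4)
print(well_calibrated, overconfident)
```

A lower value means stated confidence tracks empirical accuracy more closely, which is why the caption places the ideal model at high accuracy and low ECE.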