Can We Trust LLMs? Mitigate Overconfidence Bias in LLMs through Knowledge Transfer

Haoyan Yang, Yixuan Wang, Xingyin Xu, Hanyuan Zhang, Yirong Bian

TL;DR

This work addresses overconfidence bias in large language models (LLMs) by introducing a knowledge-transfer (KT) framework built on chain-of-thought (CoT) reasoning. Larger LLMs generate detailed CoTs and confidence signals, which are used to fine-tune smaller LLMs so that they replicate the larger models' reasoning and report calibrated confidence via Confidence-Calibrated Inference. Across multilingual multiple-choice and sentiment-analysis tasks, the KT approach substantially improves accuracy and calibration over the vanilla and question-answer (QA) fine-tuning baselines, with notable gains on TruthfulQA and related datasets. The results suggest that transferring structured reasoning from big to small models can yield trustworthy, context-appropriate outputs, with some limitations such as potential token inflation and self-dialogue risks.

Abstract

The study explores mitigating overconfidence bias in LLMs to improve their reliability. We introduce a knowledge transfer (KT) method utilizing chain of thoughts (CoTs), where "big" LLMs impart knowledge to "small" LLMs via detailed, sequential reasoning paths. This method uses the advanced reasoning of larger models to fine-tune smaller models, enabling them to produce more accurate predictions with calibrated confidence. Experimental evaluation using multiple-choice questions and sentiment analysis across diverse datasets demonstrated the KT method's superiority over the vanilla and question-answer pair (QA) fine-tuning methods. The most significant improvement appeared in three key metrics, where the KT method outperformed the vanilla and QA methods by an average of 55.3% and 43.1%, respectively. These findings underscore the KT method's potential in enhancing model trustworthiness and accuracy, offering precise outputs with well-matched confidence levels across various contexts.

Paper Structure

This paper contains 21 sections, 1 equation, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Evaluation of overconfidence in LLMs using the TruthfulQA dataset. The graphs indicate a pronounced discrepancy between confidence and correctness. In particular, models such as LLaMA 2 and Vicuna exhibit instances of high confidence in incorrect answers (high ROB), underscoring a widespread overconfidence issue in current LLMs.
  • Figure 2: Schematic representation of the KT method between "big" and "small" LLMs. The diagram shows big LLMs, such as GPT-4, acting as the "Teacher": given a "Question" and a "Prompt", they generate detailed CoTs. This output is then used to fine-tune small LLMs, exemplified by Vicuna-7B as the "Student". The fine-tuned models then apply these insights at inference time, yielding answers with quantifiable confidence levels.
  • Figure 3: Performance trends in fine-tuning the Vicuna-7B with different quantities of CoT on four datasets.
  • Figure 4: Calibration curves comparing model accuracy and confidence levels across two datasets in Vicuna.
  • ...and 2 more figures
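
The calibration curves mentioned for Figure 4 compare stated confidence against observed accuracy. As a minimal sketch of how such a curve and a summary score can be computed, the block below bins predictions by confidence and reports the expected calibration error (ECE). ECE is a standard calibration measure and is used here as an assumption; the text above does not say which three metrics the paper reports. The function names, the ten equal-width bins, and confidences expressed in [0, 1] are likewise illustrative choices, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def calibration_bins(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width confidence bins.

    Returns one (mean_confidence, accuracy, count) tuple per non-empty bin.
    Plotting accuracy against mean_confidence gives a calibration curve.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    # Each accumulator holds [sum of confidences, number correct, count].
    acc = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        acc[idx][0] += conf
        acc[idx][1] += int(ok)
        acc[idx][2] += 1
    return [(s / n, hits / n, n) for s, hits, n in acc if n > 0]


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between confidence and accuracy over all bins."""
    total = len(confidences)
    if total == 0:
        raise ValueError("need at least one prediction")
    bins = calibration_bins(confidences, correct, n_bins)
    return sum(n / total * abs(accuracy - conf) for conf, accuracy, n in bins)


if __name__ == "__main__":
    # Hypothetical model outputs: four answers at 90% stated confidence,
    # of which three are correct.
    confs = [0.9, 0.9, 0.9, 0.9]
    hits = [True, True, True, False]
    print(calibration_bins(confs, hits))
    print(expected_calibration_error(confs, hits))
```

An overconfident model of the kind described for Figure 1 would show bins whose accuracy sits well below their mean confidence, which raises the ECE; a well-calibrated model keeps the two close in every bin.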