Multi-Perspective Consistency Enhances Confidence Estimation in Large Language Models

Pei Wang, Yejie Wang, Muxi Diao, Keqing He, Guanting Dong, Weiran Xu

TL;DR

Because self-awareness in language models is fragile, a Multi-Perspective Consistency (MPC) method is introduced that mitigates overconfidence and extends effectively to other models.

Abstract

In the deployment of large language models (LLMs), accurate confidence estimation is critical for assessing the credibility of model predictions. However, existing methods often fail to overcome the issue of overconfidence on incorrect answers. In this work, we focus on improving the confidence estimation of large language models. Considering the fragility of self-awareness in language models, we introduce a Multi-Perspective Consistency (MPC) method. We leverage complementary insights from different perspectives within models (MPC-Internal) and across different models (MPC-Across) to mitigate the issue of overconfidence arising from a singular viewpoint. The experimental results on eight publicly available datasets show that our MPC achieves state-of-the-art performance. Further analyses indicate that MPC can mitigate the problem of overconfidence and is effectively scalable to other models.

Paper Structure (27 sections, 3 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: Confidence score distribution of GPT-4's incorrect samples in TruthfulQA. The horizontal axis represents the confidence scores predicted under this method, and the vertical axis represents the probability density. Theoretically, it is preferable for the entire distribution to shift left as far as possible.
  • Figure 2: The overall architecture of our Complementary Perspective Consistency for confidence estimation.
  • Figure 3: Confidence score distribution for GPT-4's incorrect samples in TruthfulQA. The horizontal axis represents the predicted confidence. The vertical axis represents the sample density. Note that a distribution shifted further to the left indicates better performance.
  • Figure 4: Confidence score distribution for GPT-4's correct samples in MedQA. The horizontal axis represents the predicted confidence scores. The vertical axis represents the sample density. Note that a distribution shifted further to the right indicates better performance.
  • Figure 5: AUROC for MPC under different $\alpha$. We run experiments with different $\alpha$ values on Business_Ethics and Anatomy. The results show that our method is robust to the choice of $\alpha$.
  • ...and 3 more figures