Uncertainty-Aware Fusion: An Ensemble Framework for Mitigating Hallucinations in Large Language Models
Prasenjit Dey, Srujana Merugu, Sivaramakrishnan Kaveri
TL;DR
This work addresses hallucinations in large language models by introducing Uncertainty-Aware Fusion (UAF), an ensemble framework for factoid QA that selects a subset of models based on their accuracy and self-assessed uncertainty and then fuses their outputs into a final answer. The SELECTOR module ranks models by a combined score $Cscore_j = Acc_j \times SAH_j$ and keeps the top $K$, while the FUSER performs instance-level fusion using criteria such as $f^k = Acc_k \times (1 - u_{test}^k)$. Across the TruthfulQA, TriviaQA, and FACTOR datasets, UAF with Haloscope generally outperforms state-of-the-art baselines by about 8% in accuracy and narrows the gap with, or surpasses, GPT-4 on several benchmarks; ablations highlight the importance of an appropriately sized ensemble. The results indicate that diversity in LLM strengths, when managed with uncertainty-aware selection, yields robust improvements in factuality and hallucination detection. The practical impact is a scalable, training-free approach to improving the reliability of LLM-powered QA systems in real-world settings.
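The two scoring rules in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `ModelStats` container, the function names, the toy numbers, and the choice to return the answer of the single highest-scoring model in the FUSER are all assumptions made for illustration. The summary only states that fusion uses criteria "like" $f^k$, so other aggregation rules are possible.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Hypothetical per-model statistics, assumed to lie in [0, 1]."""
    name: str
    acc: float  # Acc_j: accuracy of model j on held-out data
    sah: float  # SAH_j: self-assessment score of model j


def select_top_k(models, k):
    """SELECTOR sketch: rank by Cscore_j = Acc_j * SAH_j and keep the top K."""
    return sorted(models, key=lambda m: m.acc * m.sah, reverse=True)[:k]


def fuse(selected, answers, uncertainties):
    """FUSER sketch: score each selected model on one test instance with
    f^k = Acc_k * (1 - u_test^k) and return the top scorer's answer.
    Taking the argmax is an assumption, not confirmed by the source."""
    best = max(selected, key=lambda m: m.acc * (1.0 - uncertainties[m.name]))
    return answers[best.name]


# Toy example with made-up numbers.
models = [
    ModelStats("A", acc=0.80, sah=0.60),  # Cscore = 0.48
    ModelStats("B", acc=0.70, sah=0.90),  # Cscore = 0.63
    ModelStats("C", acc=0.60, sah=0.50),  # Cscore = 0.30
]
selected = select_top_k(models, k=2)  # keeps B, then A

# Per-instance answers and uncertainties u_test^k for one question.
answers = {"A": "Paris", "B": "Lyon"}
uncertainties = {"A": 0.1, "B": 0.5}  # f^A = 0.72, f^B = 0.35
final_answer = fuse(selected, answers, uncertainties)
```

In this toy case, model B ranks first in the SELECTOR because of its stronger self-assessment score, yet the FUSER picks model A's answer for this instance because A is both more accurate and less uncertain on it. This mirrors the summary's point that different models excel in different scenarios.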
Abstract
Large Language Models (LLMs) are known to hallucinate and generate non-factual outputs, which can undermine user trust. Traditional methods that directly mitigate hallucinations, such as representation editing and contrastive decoding, often require additional training data and involve high implementation complexity. While ensemble-based approaches harness multiple LLMs to tap into the "wisdom of crowds", these methods overlook uncertainties in individual model responses. Recent studies reveal that uncertainty estimation can enable LLMs to self-assess the likelihood of generating hallucinations. In this work, we focus on factoid question answering (QA) and observe that LLMs' accuracy and self-assessment capabilities vary widely, with different models excelling in different scenarios. Leveraging this insight, we propose Uncertainty-Aware Fusion (UAF), an ensemble framework that reduces hallucinations by strategically combining multiple LLMs based on their accuracy and self-assessment abilities. Empirical results on several public benchmark datasets show that UAF outperforms state-of-the-art hallucination mitigation methods by $8\%$ in factual accuracy, while either narrowing the performance gap with GPT-4 or surpassing it.
