eXplainable Bayesian Multi-Perspective Generative Retrieval
EuiYul Song, Philhoon Oh, Sangryul Kim, James Thorne
TL;DR
This work tackles the lack of interpretability and the unreliable uncertainty estimates of deterministic retrieval by integrating uncertainty calibration and explainability into the retrieval pipeline. It combines Bayesian deep learning techniques (e.g., Deep Ensemble, SWA, MC Dropout) with an explainable context reranker based on LIME and SHAP. It also introduces uncertainty-aware fusion in the decoder and multi-perspective retrieval that fuses GENRE and Re3val contexts. Key contributions include a Bayesian Context Reranker, an eXplainable Context Reranker, and a stochastic FiD pre-training approach with Jensen-Shannon Divergence, all shown to improve downstream reader accuracy on three KILT datasets without substantial training overhead. The results indicate improved robustness and grounding quality, which supports more reliable, interpretable, and cost-efficient knowledge-grounded language systems.
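For readers unfamiliar with the Jensen-Shannon Divergence named above, the following is a minimal sketch of the quantity itself. This summary does not specify how the divergence enters the stochastic FiD pre-training objective, so the sketch only computes it between two discrete distributions, for instance the output distributions of two stochastic forward passes. The function names `kl_divergence` and `js_divergence` are illustrative and not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats. Terms with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.
    It is symmetric and bounded above by ln(2) when using natural logs.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical token distributions from two stochastic forward passes.
pass_a = [0.7, 0.2, 0.1]
pass_b = [0.5, 0.3, 0.2]
disagreement = js_divergence(pass_a, pass_b)
```

A value near zero means the two passes agree. Larger values signal disagreement, which is one way such a divergence could serve as an uncertainty or consistency signal.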
Abstract
Modern deterministic retrieval pipelines prioritize achieving state-of-the-art performance but often lack interpretability in decision-making. These models face challenges in assessing uncertainty, leading to overconfident predictions. To overcome these limitations, we integrate uncertainty calibration and interpretability into a retrieval pipeline. Specifically, we introduce Bayesian methodologies and multi-perspective retrieval to calibrate uncertainty within a retrieval pipeline. We incorporate techniques such as LIME and SHAP to analyze the behavior of a black-box reranker model. The importance scores derived from these explanation methodologies serve as supplementary relevance scores to enhance the base reranker model. We evaluate the resulting performance enhancements achieved through uncertainty calibration and interpretable reranking on Question Answering and Fact Checking tasks. Our methods demonstrate substantial performance improvements across three KILT datasets.
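The idea of using explanation-derived importance scores as supplementary relevance scores can be sketched as follows. This is an illustrative assumption rather than the authors' exact formulation: the abstract does not state how the scores are combined, so the sketch uses a simple weighted sum after min-max normalization. The names `fuse_scores`, `rerank`, and the `weight` parameter are hypothetical, and the importance scores are assumed to have been computed beforehand (for example, by aggregating LIME or SHAP attributions per passage).

```python
def fuse_scores(base_scores, importance_scores, weight=0.5):
    """Add min-max-normalized importance scores to base relevance scores.

    `weight` is a hypothetical mixing coefficient, not a value from the paper.
    """
    lo, hi = min(importance_scores), max(importance_scores)
    span = hi - lo
    # If all importance scores are equal, they carry no ranking signal.
    normalized = [(s - lo) / span if span > 0 else 0.0 for s in importance_scores]
    return [b + weight * n for b, n in zip(base_scores, normalized)]


def rerank(passages, base_scores, importance_scores, weight=0.5):
    """Return passages sorted by fused score, highest first."""
    fused = fuse_scores(base_scores, importance_scores, weight)
    order = sorted(range(len(passages)), key=lambda i: fused[i], reverse=True)
    return [passages[i] for i in order]


# Hypothetical example: three candidate passages for one query.
passages = ["passage_a", "passage_b", "passage_c"]
base = [0.5, 0.6, 0.4]        # scores from the base reranker
importance = [0.9, 0.1, 0.5]  # precomputed explanation-based scores
reranked = rerank(passages, base, importance)
```

Setting `weight=0` recovers the base reranker's ordering, so the supplementary signal can be tuned or disabled without changing the base model.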
