Multi-Granularity Guided Fusion-in-Decoder
Eunseong Choi, Hyeri Lee, Jongwuk Lee
TL;DR
MGFiD tackles spurious evidence in open-domain question answering (ODQA) by jointly learning evidence at coarse (passage) and fine (sentence) granularities and guiding decoding with an anchor vector. It combines passage re-ranking, sentence classification, and threshold-based pruning within a multi-task Fusion-in-Decoder (FiD) framework, augmented by pseudo-labels from LLMs. The approach yields significant exact-match (EM) gains on Natural Questions (NQ) and moderate gains on TriviaQA (TQA), while reducing decoder workload by pruning the input to a small, relevant subset of passages. This multi-granularity strategy improves both accuracy and decoding efficiency, making evidence discrimination in ODQA more robust and more practical to deploy.
Abstract
In Open-domain Question Answering (ODQA), it is essential to discern relevant contexts as evidence and to avoid spurious ones among the retrieved results. Fusion-in-Decoder (FiD), a model architecture that concatenates multiple contexts in the decoding phase, demonstrates promising performance but can generate incorrect outputs from seemingly plausible contexts. To address this problem, we propose the Multi-Granularity guided Fusion-in-Decoder (MGFiD), which discerns evidence across multiple levels of granularity. Based on multi-task learning, MGFiD harmonizes passage re-ranking with sentence classification. It aggregates the sentences classified as evidence into an anchor vector that instructs the decoder. Additionally, it improves decoding efficiency by reusing the results of passage re-ranking for passage pruning. In our experiments, MGFiD outperforms existing models on the Natural Questions (NQ) and TriviaQA (TQA) datasets, highlighting the benefits of its multi-granularity solution.
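The two mechanisms named above, threshold-based passage pruning and aggregation of evidence sentences into an anchor vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `threshold` and `min_keep` parameters, and the use of mean pooling for aggregation are all assumptions made for illustration, and plain Python lists stand in for model tensors.

```python
from typing import List, Sequence


def prune_passages(scores: Sequence[float], threshold: float,
                   min_keep: int = 1) -> List[int]:
    """Return indices of passages to keep, ordered by descending score.

    `scores` are hypothetical re-ranker relevance scores in [0, 1].
    Passages scoring below `threshold` are dropped, but at least
    `min_keep` top-ranked passages are always retained (an assumption).
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = [i for i in ranked if scores[i] >= threshold]
    if len(kept) < min_keep:
        kept = ranked[:min_keep]
    return kept


def anchor_vector(sentence_vecs: Sequence[Sequence[float]],
                  sentence_probs: Sequence[float],
                  threshold: float = 0.5) -> List[float]:
    """Mean-pool the vectors of sentences classified as evidence.

    Mean pooling is an assumed aggregation; the source only states that
    evidence sentences are aggregated into an anchor vector.
    """
    if not sentence_vecs:
        return []
    dim = len(sentence_vecs[0])
    selected = [v for v, p in zip(sentence_vecs, sentence_probs)
                if p >= threshold]
    if not selected:
        # No sentence passed the classifier: fall back to a zero vector.
        return [0.0] * dim
    return [sum(v[d] for v in selected) / len(selected) for d in range(dim)]


# Toy usage with made-up scores and 2-dimensional sentence vectors.
kept = prune_passages([0.9, 0.1, 0.6, 0.05], threshold=0.5)
anchor = anchor_vector([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                       [0.9, 0.2, 0.7])
```

In this sketch only the kept passages would be passed on to the decoder, which is where the efficiency gain comes from, while the anchor vector would be supplied to the decoder as an additional guidance signal.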
