Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges
Abhilasha Sancheti, Koustava Goswami, Balaji Vasan Srinivasan
TL;DR
This work defines a post-hoc answer attribution task for long document comprehension to enhance trustworthiness in information-seeking QA. It reformulates two existing datasets (Citation Verifiability and Hagrid) to enable evaluation and proposes ADiOSAA, a two-component system with an Answer Decomposer and a DocNLI-based attributor plus an optimal selection algorithm for sentence-level attributions. Empirical results show retrieval baselines excel at top-1 attributions, while ADiOSAA achieves higher precision when considering multiple attributions, especially with optimal selection; decomposing answers and using entailment improves performance on abstractive, multi-sentence attributions. The findings underscore the need for more abstractive long-form datasets to drive progress in trustworthy, verifiable QA systems and attribution mechanisms.
Abstract
Attributing answer text to its source document for information-seeking questions is crucial for building trustworthy, reliable, and accountable systems. We formulate a new task of post-hoc answer attribution for long document comprehension (LDC). Owing to the lack of long-form abstractive and information-seeking LDC datasets, we refactor existing datasets to assess the strengths and weaknesses of two kinds of attribution systems for this task: existing retrieval-based systems, and our proposed systems, which combine answer decomposition with textual entailment-based optimal selection. We shed light on the limitations of existing datasets and on the need for new datasets that can assess the actual performance of systems on this task.
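To make the task concrete, the following is a minimal sketch of sentence-level post-hoc attribution: the answer is split into units, each unit is scored against every document sentence, and the best-scoring sentences above a threshold are returned as attributions. It is an illustration under stated assumptions, not the system evaluated in this work. The function names, the `top_k` and `threshold` parameters, and the token-overlap scorer are all hypothetical; in particular, `support_score` is a crude stand-in for a textual entailment model, and the naive sentence splitter stands in for a learned answer decomposer.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (stand-in for a learned answer decomposer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim, evidence):
    """Hypothetical scorer: fraction of claim tokens that appear in evidence.

    Assumption: this only approximates the role of an entailment model.
    """
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(evidence)) / len(claim_tokens)


def attribute(answer, document, top_k=1, threshold=0.5):
    """Map each answer unit to up to top_k supporting document sentences."""
    doc_sentences = split_sentences(document)
    attributions = {}
    for unit in split_sentences(answer):
        # Rank document sentences by score, breaking ties by document order.
        ranked = sorted(
            ((support_score(unit, sent), idx) for idx, sent in enumerate(doc_sentences)),
            key=lambda pair: (-pair[0], pair[1]),
        )
        attributions[unit] = [
            doc_sentences[idx] for score, idx in ranked[:top_k] if score >= threshold
        ]
    return attributions


if __name__ == "__main__":
    doc = "The Eiffel Tower is in Paris. It was completed in 1889. Bananas are yellow."
    ans = "The tower is in Paris. It was completed in 1889."
    for unit, sources in attribute(ans, doc).items():
        print(unit, "<-", sources)
```

Raising `top_k` returns multiple candidate attributions per answer unit, which is the setting where precision across several attributions matters. An answer unit with no sufficiently supported sentence maps to an empty list rather than a forced match.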
