Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges

Abhilasha Sancheti, Koustava Goswami, Balaji Vasan Srinivasan

TL;DR

This work defines a post-hoc answer attribution task for long document comprehension, aimed at making information-seeking QA more trustworthy. It reformulates two existing datasets (Citation Verifiability and Hagrid) to enable evaluation and proposes ADiOSAA, a two-component system: an Answer Decomposer, and a DocNLI-based attributor with an optimal selection algorithm for sentence-level attributions. Empirical results show that retrieval baselines excel at top-1 attributions, while ADiOSAA achieves higher precision when multiple attributions are considered, especially with optimal selection; decomposing answers and using entailment improve performance on abstractive, multi-sentence attributions. The findings underscore the need for more abstractive long-form datasets to drive progress in trustworthy, verifiable QA systems and attribution mechanisms.

Abstract

Attributing answer text to its source document for information-seeking questions is crucial for building trustworthy, reliable, and accountable systems. We formulate a new task of post-hoc answer attribution for long document comprehension (LDC). Owing to the lack of long-form abstractive and information-seeking LDC datasets, we refactor existing datasets to assess the strengths and weaknesses of existing retrieval-based systems and of our proposed attribution systems, which combine answer decomposition with textual entailment-based optimal selection. We shed light on the limitations of existing datasets and on the need for new datasets that can assess the actual performance of systems on this task.

Paper Structure (22 sections, 1 figure, 6 tables, 1 algorithm)

Figures (1)

  • Figure 1: Overview of proposed answer attribution system, ADiOSAA. The answer decomposer breaks the given answer into information units, and the attributor finds the supporting sentences as attributions for each information unit in the answer.
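
The decompose-then-attribute flow described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a naive sentence split in place of the Answer Decomposer, a token-overlap score in place of a DocNLI entailment model, and a plain per-unit top-k ranking in place of the optimal selection algorithm. All function names (`split_sentences`, `overlap_score`, `attribute`) are hypothetical.

```python
import re
from typing import Callable, Dict, List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter (assumption: stands in for the answer decomposer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(premise: str, hypothesis: str) -> float:
    """Toy support score: fraction of hypothesis tokens present in the premise.

    Assumption: a placeholder for an entailment model; this is not DocNLI.
    """
    hyp = _tokens(hypothesis)
    if not hyp:
        return 0.0
    return len(hyp & _tokens(premise)) / len(hyp)


def attribute(
    answer: str,
    document: str,
    score: Callable[[str, str], float] = overlap_score,
    top_k: int = 1,
) -> Dict[str, List[str]]:
    """Map each information unit of the answer to its top_k supporting sentences."""
    doc_sentences = split_sentences(document)
    attributions: Dict[str, List[str]] = {}
    for unit in split_sentences(answer):
        # Rank document sentences by how well they support this unit.
        ranked = sorted(doc_sentences, key=lambda s: score(s, unit), reverse=True)
        attributions[unit] = ranked[:top_k]
    return attributions


if __name__ == "__main__":
    document = (
        "The Eiffel Tower is in Paris. It was completed in 1889. "
        "Many tourists visit it every year."
    )
    answer = "The tower is in Paris. It was completed in 1889."
    for unit, support in attribute(answer, document).items():
        print(unit, "<-", support)
```

Passing a different `score` callable (for example, one that wraps an entailment classifier) would change the attributor without touching the decomposition step, which mirrors the two-component split in the caption.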