Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition

Pritika Ramu; Koustava Goswami; Apoorv Saxena; Balaji Vasan Srinivasan

TL;DR

A novel approach to the factual decomposition of generated answers for attribution is proposed. It employs template-based in-context learning and integrates negative sampling during few-shot in-context learning for decomposition, enhancing the semantic understanding of both abstractive and extractive answers.

Abstract

Accurately attributing answer text to its source document is crucial for developing a reliable question-answering system. However, attribution for long documents remains largely unexplored. Post-hoc attribution systems are designed to map answer text back to the source document, yet the granularity of this mapping has not been addressed. Furthermore, a critical question arises: What exactly should be attributed? This involves identifying the specific information units within an answer that require grounding. In this paper, we propose and investigate a novel approach to the factual decomposition of generated answers for attribution, employing template-based in-context learning. To accomplish this, we utilize the question and integrate negative sampling during few-shot in-context learning for decomposition. This approach enhances the semantic understanding of both abstractive and extractive answers. We examine the impact of answer decomposition across various attribution approaches, ranging from retrieval-based techniques to LLM-based attributors.
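To make the task concrete, the following is a minimal sketch of a post-hoc attribution loop: an answer is split into units, and each unit is mapped to the most similar document sentence. It is not the method proposed in the paper. The sentence splitter stands in for the LLM-based decomposition, the Jaccard token overlap stands in for a retriever, and all function names (`split_sentences`, `overlap_score`, `attribute`) are hypothetical.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    # In the paper's setting, an LLM-based decomposition would produce the units.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    # Lowercased alphanumeric tokens as a set.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(unit, sentence):
    # Jaccard similarity between token sets; a stand-in for a real retriever.
    a, b = tokens(unit), tokens(sentence)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def attribute(answer, document, top_k=1):
    # Map each answer unit to its top_k highest-scoring document sentences.
    doc_sentences = split_sentences(document)
    results = {}
    for unit in split_sentences(answer):
        ranked = sorted(
            doc_sentences,
            key=lambda s: overlap_score(unit, s),
            reverse=True,
        )
        results[unit] = ranked[:top_k]
    return results
```

Calling `attribute(answer, document)` returns a dictionary from each answer unit to its candidate evidence sentences. Swapping `split_sentences` for a different decomposition changes which units are attributed, which is the granularity question the abstract raises.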

Paper Structure (33 sections, 1 equation, 5 figures, 10 tables, 1 algorithm)

Introduction
Related Work
Method
Task Definition
Answer Decomposition
Definition
Revisiting Fine Grained Decomposition
Coarse Grained Decomposition (CoG)
Classifier
Attributors
Retrievers
Large Language Models
Datasets
Citation Verifiability Dataset
QASPER Dataset
...and 18 more sections

Figures (5)

Figure 1: An example from the Verifiability dataset. The input to the post-hoc attribution system is the question, document, and answer. The output is evidence sentences from the document. Text marked in red does not require attribution.
Figure 2: Pipeline for attribution: answers are decomposed and sent to the attributor to identify evidence.
Figure 3: Average number of decompositions per sentence using each method.
Figure 4: Screenshot of Microsoft Forms used for survey.
Figure 5: Human Annotation Error
