Attention with Dependency Parsing Augmentation for Fine-Grained Attribution

Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo

TL;DR

The paper addresses fine-grained attribution for retrieval-augmented generation (RAG), where existing model-internals-based methods are either coarse-grained or restricted to the left-to-right context of decoder-only models. It introduces AttnUnion, which aggregates token-wise evidence by set union, and a dependency parsing augmentation that enriches target-span attributions using an attention-based similarity $\mathbf{S}$. To handle two practical constraints, inaccessible attention weights and limited memory, it computes $\mathbf{S}$ in a memory-efficient way and approximates it with open-source LLMs. Empirical results on QuoteSum and VERI-GRAN show that AttnUnionDep achieves new state-of-the-art accuracy for fine-grained attribution and generalizes to sentence-level attribution, with strong faithfulness to generators and favorable latency. The method is practical for real-time attribution systems and performs robustly across model backbones. The stated limitations, left for future work, concern target-span selection, language scope, and the rule-based nature of the dependency augmentation.

Abstract

To assist humans in efficiently validating RAG-generated content, developing a fine-grained attribution mechanism that provides supporting evidence from retrieved documents for every answer span is essential. Existing fine-grained attribution methods rely on model-internal similarity metrics between responses and documents, such as saliency scores and hidden state similarity. However, these approaches suffer from either high computational complexity or coarse-grained representations. Additionally, a common problem shared by the previous works is their reliance on decoder-only Transformers, limiting their ability to incorporate contextual information after the target span. To address the above problems, we propose two techniques applicable to all model-internals-based methods. First, we aggregate token-wise evidence through set union operations, preserving the granularity of representations. Second, we enhance the attributor by integrating dependency parsing to enrich the semantic completeness of target spans. For practical implementation, our approach employs attention weights as the similarity metric. Experimental results demonstrate that the proposed method consistently outperforms all prior works.
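The set-union aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the top-`k` selection rule, the function names, and the toy similarity matrix are assumptions made for illustration, and `S[i][j]` stands for the similarity (for example, an attention weight) between response token `i` and document token `j`.

```python
from typing import List, Set


def token_evidence(scores: List[float], k: int) -> Set[int]:
    """Indices of the k highest-scoring document tokens for one response token.

    Assumption: evidence for a single token is chosen by top-k; the source
    only states that token-wise evidence is combined by set union.
    """
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def union_attribution(S: List[List[float]], span: List[int], k: int = 2) -> Set[int]:
    """Evidence for a target span: the union of each span token's evidence set."""
    evidence: Set[int] = set()
    for i in span:
        evidence |= token_evidence(S[i], k)
    return evidence


if __name__ == "__main__":
    # Hypothetical similarity matrix: 2 response tokens x 4 document tokens.
    S = [
        [0.1, 0.7, 0.2, 0.0],
        [0.0, 0.1, 0.3, 0.6],
    ]
    print(sorted(union_attribution(S, span=[0, 1], k=1)))
```

In this toy input the two response tokens point at different document tokens. Taking the union keeps both, whereas averaging the two rows first and then selecting a single top token would keep only one of them, which is the loss of granularity the abstract refers to.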

Paper Structure

This paper contains 27 sections, 4 equations, 7 figures, 9 tables, 1 algorithm.

Figures (7)

  • Figure 1: An example of fine-grained attribution, i.e., finding evidence from the retrieved documents for arbitrary target spans. Each highlighted span in the answer is a target, with evidence in the same background color.
  • Figure 2: An illustration of dependency parsing augmentation. Suppose the target span is the token "one". The method first finds the closest verb ancestor of "one", i.e., "earned", and then collects successors of "earned", excluding unrelated coordinating constituents "two million dollars" and "2013". The resulting augmentation tokens are in red. Lastly, the attribution of "one" is updated by summing the attributions of the augmentation tokens.
  • Figure 3: Results of ablating Dep. Here AU and HU represent AttnUnion and HSSUnion, respectively.
  • Figure 4: An illustration of reforming the coordinate structures, where the words framed by dashed lines are coordinate components.
  • Figure 5: The validation accuracy of AttnUnionDep against the layer from which the attention weights are extracted (fixing $k = 2, \tau = 2$). For Qwen2 7B, $L = 28$ and $\lfloor L/2\rfloor + 1 = 15$. For Llama2 7B, $L = 32$ and $\lfloor L/2\rfloor + 1 = 17$.
  • ...and 2 more figures
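
The procedure in the Figure 2 caption (find the closest verb ancestor, collect the tokens below it while skipping unrelated coordinated constituents, then sum their attributions) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's rule set: the toy sentence, its hand-written parse (`HEADS`, `POS`, `DEPS`), the function names, and the rule "skip any `conj` subtree that does not contain the target" are all hypothetical, and the coordinate-structure reforming mentioned in Figure 4 is not modeled.

```python
from typing import List, Optional, Set

# Hypothetical toy parse of "Bob earned one dollar and two euros".
# HEADS[i] is the index of token i's head; -1 marks the root.
TOKENS = ["Bob", "earned", "one", "dollar", "and", "two", "euros"]
HEADS = [1, -1, 3, 1, 6, 6, 3]
POS = ["PROPN", "VERB", "NUM", "NOUN", "CCONJ", "NUM", "NOUN"]
DEPS = ["nsubj", "root", "nummod", "obj", "cc", "nummod", "conj"]


def closest_verb_ancestor(target: int, heads: List[int], pos: List[str]) -> Optional[int]:
    """Walk up the tree from the target and return the first verb found."""
    node = heads[target]
    while node != -1:
        if pos[node] == "VERB":
            return node
        node = heads[node]
    return None


def augmentation_tokens(target: int, heads: List[int], pos: List[str],
                        deps: List[str]) -> Set[int]:
    """Verb ancestor plus its descendants, minus conjuncts unrelated to the target."""
    verb = closest_verb_ancestor(target, heads, pos)
    if verb is None:
        return {target}  # no verb above the target: nothing to add
    # Tokens on the path from the target up to the root.
    on_path: Set[int] = set()
    node = target
    while node != -1:
        on_path.add(node)
        node = heads[node]
    collected: Set[int] = set()
    stack = [verb]
    while stack:
        node = stack.pop()
        collected.add(node)
        for child, head in enumerate(heads):
            if head != node:
                continue
            # Assumed simplification: drop coordinated subtrees without the target.
            if deps[child] == "conj" and child not in on_path:
                continue
            stack.append(child)
    return collected


def augmented_attribution(target: int, S: List[List[float]], heads: List[int],
                          pos: List[str], deps: List[str]) -> List[float]:
    """Sum the per-token attribution rows of S over the augmentation tokens."""
    chosen = augmentation_tokens(target, heads, pos, deps)
    return [sum(S[i][j] for i in chosen) for j in range(len(S[0]))]


if __name__ == "__main__":
    picked = augmentation_tokens(2, HEADS, POS, DEPS)  # target: "one"
    print([TOKENS[i] for i in sorted(picked)])
```

For the target "one", the sketch walks up through "dollar" to the verb "earned" and collects "Bob", "earned", "one", and "dollar", leaving out the coordinated "and two euros". A target inside the second conjunct would still pull in the first conjunct under this simplified rule, which is one reason a real system needs more careful handling of coordination.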