Attention with Dependency Parsing Augmentation for Fine-Grained Attribution
Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo
TL;DR
The paper addresses fine-grained attribution for retrieval-augmented generation (RAG), targeting two weaknesses of prior methods: coarse-grained evidence representations and reliance on decoder-only generators, which cannot use context that follows the target span. It introduces AttnUnion, which aggregates token-wise evidence by set union, and a dependency parsing augmentation that enriches target-span attributions, both built on an attention-based similarity $\mathbf{S}$. To handle practical constraints, it computes $\mathbf{S}$ in a memory-efficient way and, when a generator's attention weights are inaccessible, approximates them with open-source LLMs. On QuoteSum and VERI-GRAN, AttnUnionDep achieves state-of-the-art accuracy for fine-grained attribution and generalizes to sentence-level attribution, with strong faithfulness to generators and favorable latency. The method suits real-time attribution systems and performs robustly across model backbones. The stated limitations, left for future work, concern target-span selection, language scope, and the rule-based design of the dependency augmentation.
Abstract
To help humans efficiently validate RAG-generated content, it is essential to develop a fine-grained attribution mechanism that provides supporting evidence from retrieved documents for every answer span. Existing fine-grained attribution methods rely on model-internal similarity metrics between responses and documents, such as saliency scores and hidden state similarity. However, these approaches suffer from either high computational complexity or coarse-grained representations. In addition, previous work shares a common limitation: it relies on decoder-only Transformers, which cannot incorporate contextual information that appears after the target span. To address these problems, we propose two techniques applicable to all model-internals-based methods. First, we aggregate token-wise evidence through set union operations, preserving the granularity of representations. Second, we enhance the attributor by integrating dependency parsing to enrich the semantic completeness of target spans. For practical implementation, our approach employs attention weights as the similarity metric. Experimental results demonstrate that the proposed method consistently outperforms all prior work.
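
The two techniques can be illustrated with a toy sketch. It is not the paper's implementation. It assumes a small hand-written similarity matrix `S` (rows are response tokens, columns are document tokens), a top-`k` rule for choosing each token's evidence, and a hypothetical augmentation rule that adds the direct dependents of the target span's tokens. The function names, the value of `k`, and the `heads` array are all illustrative assumptions. A real system would take `S` from attention weights and `heads` from a dependency parser.

```python
def top_k_evidence(weights, k):
    """Return the indices of the k largest weights in one row of S.

    Ties are broken in favor of the lower index (an arbitrary choice).
    """
    order = sorted(range(len(weights)), key=lambda j: (-weights[j], j))
    return set(order[:k])


def attn_union(S, span, k=2):
    """Union the per-token evidence sets of every response token in `span`.

    Each token keeps its own evidence, so nothing is lost to averaging.
    """
    evidence = set()
    for i in span:
        evidence |= top_k_evidence(S[i], k)
    return evidence


def augment_span(span, heads):
    """Hypothetical augmentation rule: add direct dependents of span tokens.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    """
    span_set = set(span)
    dependents = {i for i, h in enumerate(heads) if h in span_set}
    return sorted(span_set | dependents)


# Toy similarity matrix: 4 response tokens x 6 document tokens.
S = [
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
    [0.05, 0.60, 0.20, 0.05, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.40, 0.40],
    [0.05, 0.05, 0.30, 0.50, 0.05, 0.05],
]

# Toy dependency heads: token 1 is the root, tokens 0 and 2 attach to
# token 1, and token 3 attaches to token 2.
heads = [1, -1, 1, 2]

target = [2]  # attribute the single response token at index 2
plain = attn_union(S, target, k=2)
augmented = attn_union(S, augment_span(target, heads), k=2)

print(sorted(plain))      # evidence for the target span alone
print(sorted(augmented))  # evidence after adding the span's dependents
```

In this toy setup, augmenting the target span pulls in the evidence of a later response token (index 3) that depends on it. This is one way to bring in context that follows the span. The union keeps each token's evidence separate instead of merging the rows of `S` into a single averaged representation.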
