Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering

Anirudh Phukan; Shwetha Somasundaram; Apoorv Saxena; Koustava Goswami; Balaji Vasan Srinivasan

TL;DR

This work introduces a novel method for attribution in contextual question answering that leverages the hidden state representations of LLMs. It also presents Verifiability-granular, an attribution dataset with token-level annotations for LLM generations in the contextual question answering setup.

Abstract

With advances in generative artificial intelligence (AI), contextual question answering has become extremely relevant. Attributing model generations to the input source document is essential to ensure trustworthiness and reliability. We observe that when large language models (LLMs) are used for contextual question answering, the output answer often consists of text copied verbatim from the input prompt, linked together with "glue text" generated by the LLM. Motivated by this, we propose that LLMs have an inherent awareness of where the text was copied from, likely captured in the hidden states of the LLM. We introduce a novel method for attribution in contextual question answering that leverages the hidden state representations of LLMs. Our approach bypasses the need for extensive model retraining and retrieval model overhead, offering granular attributions and preserving the quality of generated answers. Our experimental results demonstrate that our method performs on par with or better than GPT-4 at identifying verbatim copied segments in LLM generations and at attributing these segments to their source. Importantly, our method shows robust performance across various LLM architectures, highlighting its broad applicability. Additionally, we present Verifiability-granular, an attribution dataset with token-level annotations for LLM generations in the contextual question answering setup.

Paper Structure (29 sections, 3 equations, 14 figures, 3 tables)

Introduction
Related Work
Problem Statement
Proposed Work
Motivation
Methodology
Identifying extractive output tokens
Attributing Extractive Spans
Experimental Setup
Datasets
Metrics
Metrics for Sub-task 1
Metrics for Sub-task 2
Baselines
Baselines for Sub-task 1
...and 14 more sections

Figures (14)

Figure 1: Overall picture of our proposed methodology. Our method uses hidden layer representations from both the document and the answer to determine the attribution of answer tokens. First, we identify extractive answer tokens (§ 4.2.1) through a cosine similarity matrix between document and answer tokens. Then, we map these tokens to document token sequences by identifying anchor tokens and generating candidates, which are ranked by their cosine similarity to achieve attribution (§ 4.2.2).
Figure 2: Semi-extractive answers by LLMs
Figure 3: Performance of our method and baselines on the Veri-gran test set illustrated using the Precision-Recall curve.
Figure 4: Comparison of token F1 performance across layers of different models for identifying output tokens extracted from the document, on the QuoteSum train set.
Figure 5: Comparison of model accuracy across layers for attributing extractive spans, on the QuoteSum train set.
...and 9 more figures
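
The first step named in the Figure 1 caption, a cosine similarity matrix between document and answer token representations, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy vectors, and the fixed `threshold` are assumptions made for illustration, and the paper's actual decision rule (§ 4.2.1) and choice of layer are not reproduced here. In practice the rows would be hidden states taken from some layer of the LLM.

```python
import numpy as np

def cosine_similarity_matrix(answer_states, doc_states):
    """Return an (n_answer, n_doc) matrix of cosine similarities.

    Assumes every row is a non-zero vector.
    """
    a = answer_states / np.linalg.norm(answer_states, axis=1, keepdims=True)
    d = doc_states / np.linalg.norm(doc_states, axis=1, keepdims=True)
    return a @ d.T

def flag_extractive_tokens(answer_states, doc_states, threshold=0.9):
    """Flag answer tokens whose best document match reaches the threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    Returns (boolean flags, index of the best-matching document token).
    """
    sim = cosine_similarity_matrix(answer_states, doc_states)
    best_doc_index = sim.argmax(axis=1)
    best_score = sim.max(axis=1)
    return best_score >= threshold, best_doc_index

# Toy stand-ins for hidden states: 3 document tokens, 2 answer tokens.
doc_states = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
answer_states = np.array([[0.0, 2.0, 0.0],   # same direction as document token 1
                          [1.0, 1.0, 1.0]])  # not close to any single document token

flags, best_doc_index = flag_extractive_tokens(answer_states, doc_states)
```

In this toy case the first answer token is flagged as extractive and matched to document token 1, while the second is treated as generated "glue text". The later steps in the caption (anchor tokens, candidate generation, ranking) are not covered by this sketch.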
