Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell

Taiming Lu; Muhan Gao; Kuai Yu; Adam Byerly; Daniel Khashabi

TL;DR

This study explores LLMs' long-context reasoning by probing their hidden representations, finding that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses, revealing a disconnect between information retrieval and utilization.

Abstract

Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information retrieval and utilization, a "know but don't tell" phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.

Paper Structure (26 sections, 1 equation, 10 figures)

Introduction
Related Work
Positional bias.
Probing.
Experimental Setup
Datasets and prompts.
Probing classifiers.
Models and hyperparameters.
Metrics.
LLMs Know but Don't Tell
Experiment: Peak Probing Accuracy Across LLM Layers
LLMs know but don't tell.
Experiment: Probing Across Layers
Middle-context information requires more layers to be located.
Experiment: Number of Layers Taken for Locating Target Information
...and 11 more sections

Figures (10)

Figure 1: Following the prompts of Liu et al. (2023), we train a probing classifier for each transformer layer to probe the model's ability to identify useful information. The peak accuracy among layers indicates the model's long-context processing effectiveness.
Figure 2: Accuracy of LLMs in directly generating answers (blue line) compared to the maximum probing accuracy across layers by our probing classifiers (red line). In both tasks, our probing classifiers surpass the model's generated answers across all gold positions. This highlights a distinction between knowing the context and utilizing it.
Figure 3: The probing accuracy for each layer in the two tasks: kv-pairs (left) and MDQA (right). Different colors represent the position of target information within the input context. In both tasks, extracting mid-context information requires more layers.
Figure 4: The LLM layer that achieves the peak probing accuracy ($x$-axis) vs. the accuracy of LLM in generating the correct answer ($y$-axis). We observe that a later peak correlates with lower accuracy in the language model's final output. This implies that the earlier an LLM encodes information from a specific index, the higher the accuracy of the final output for that position.
Figure 5: Replicating the results of Fig. \ref{fig:probing_layer_section_3} and Fig. \ref{fig:probing_comparison_section_4} using the Gemma model with 100 kv-pairs. The findings for this model also align with the observations in §\ref{sec:4.1} and §\ref{sec:4:2}. On the right, there is a notable gap between generation accuracy and peak probing accuracy, mirroring the results observed with Mistral in the main text.
...and 5 more figures
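
The per-layer probing idea in the Figure 1 caption can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the "hidden states" are synthetic vectors, the probe is a simple nearest-centroid classifier chosen only to keep the example dependency-free, and all names (`make_layer_features`, `probe_all_layers`, `layer_signals`) are hypothetical. It shows only the overall shape of the procedure: train one probe per layer to predict the gold position, then take the best layer's accuracy as the peak probing accuracy.

```python
import random


def make_layer_features(n_examples, n_positions, signal, rng):
    """Synthetic stand-in for one layer's hidden states (an assumption).

    Each example gets a random gold-position label. Its feature vector is
    Gaussian noise with `signal` added on the coordinate of the label, so a
    larger `signal` means the position is easier to decode from that layer.
    """
    data = []
    for _ in range(n_examples):
        label = rng.randrange(n_positions)
        vec = [rng.gauss(0.0, 1.0) for _ in range(n_positions)]
        vec[label] += signal
        data.append((vec, label))
    return data


def fit_centroids(train, n_positions):
    """Fit the probe: one mean feature vector per gold position."""
    dim = len(train[0][0])
    sums = [[0.0] * dim for _ in range(n_positions)]
    counts = [0] * n_positions
    for vec, label in train:
        counts[label] += 1
        for i, value in enumerate(vec):
            sums[label][i] += value
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def predict(centroids, vec):
    """Predict the gold position whose centroid is closest to `vec`."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(range(len(centroids)), key=lambda k: sq_dist(centroids[k]))


def probe_accuracy(train, test, n_positions):
    """Train a probe on `train` and return its accuracy on held-out `test`."""
    centroids = fit_centroids(train, n_positions)
    correct = sum(predict(centroids, vec) == label for vec, label in test)
    return correct / len(test)


def probe_all_layers(signals, n_positions=5, n_train=400, n_test=200, seed=0):
    """Train an independent probe for each layer; return per-layer accuracy."""
    rng = random.Random(seed)
    accuracies = []
    for signal in signals:
        train = make_layer_features(n_train, n_positions, signal, rng)
        test = make_layer_features(n_test, n_positions, signal, rng)
        accuracies.append(probe_accuracy(train, test, n_positions))
    return accuracies


# Hypothetical per-layer signal strengths: position information is weak in
# early layers, strongest in a middle layer, and slightly weaker afterwards.
layer_signals = [0.0, 1.0, 3.0, 6.0, 4.0]
layer_acc = probe_all_layers(layer_signals)

# Peak probing accuracy and the layer index at which it is reached.
peak_layer = max(range(len(layer_acc)), key=layer_acc.__getitem__)
peak_accuracy = layer_acc[peak_layer]
print(layer_acc, peak_layer, peak_accuracy)
```

In a real experiment the synthetic vectors would be replaced by hidden states extracted from each transformer layer, and `peak_accuracy` would be compared against the model's generation accuracy for the same gold position.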
