A Survey on Transformer Context Extension: Approaches and Evaluation

Yijun Liu, Jinzheng Yu, Yang Xu, Zhongyang Li, Qingfu Zhu

TL;DR

The paper surveys Transformer-based long-context processing, identifying three core challenges (OOD extrapolation, Lost-in-the-Middle, and quadratic attention complexity) and proposing a four-category taxonomy—positional encoding, context compression, retrieval augmented, and attention pattern—to organize approaches. It systematically analyzes evaluation by data, tasks, and metrics across existing long-context benchmarks, highlighting issues such as knowledge leakage and the need for coherent, scalable benchmarks. The work highlights practical techniques (e.g., RoPE variants, memory/compression schemes, and retrieval-augmented methods) and discusses future directions including method integration, long-text generation, and improved evaluation paradigms. Collectively, it provides a comprehensive guide for researchers to navigate long-context Transformer research, from methodological taxonomies to robust evaluation design, while acknowledging current limitations and open problems.

Abstract

Large language models (LLMs) based on the Transformer have been widely applied in the field of natural language processing (NLP), demonstrating strong performance, particularly on short-text tasks. However, in long-context scenarios, the performance of LLMs degrades due to several challenges. A number of approaches have recently been proposed to alleviate this degradation. In this survey, we first list the challenges of applying pre-trained LLMs to long contexts. We then systematically review the approaches related to long context and propose a taxonomy that categorizes them into four main types: positional encoding, context compression, retrieval augmented, and attention pattern. In addition to the approaches, we focus on the evaluation of long context, organizing relevant data, tasks, and metrics based on existing long-context benchmarks. Finally, we summarize unresolved issues in the long-context domain and put forward our views on future developments.

Paper Structure

This paper contains 97 sections, 22 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Framework of the survey. We first list three inherent challenges in Section \ref{sec:challenges}. We then systematically review related approaches and propose a novel taxonomy with four major categories in Section \ref{sec:approaches}. Next, in Section \ref{sec:evaluation}, we organize evaluation from three perspectives (data, tasks, and metrics) based on existing benchmarks. Finally, we present our views on the future roadmap and open problems in Section \ref{sec:future}.
  • Figure 2: Distribution of the average input length (in words) of the datasets in each task. The color of each bar indicates the category of the task; bars of the same color belong to the same category.