An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention

Madhusudan Ghosh, Rishabh Gupta

TL;DR

This work investigates zero-shot, inference-only methods that improve position encodings and optimize attention mechanisms to enable context length extrapolation in code, particularly for long code completion tasks.

Abstract

The rapid advancement of large language models (LLMs) has led to a significant increase in automated software engineering tools capable of performing various code-related tasks such as code generation, completion, and translation. Despite these advancements, their effectiveness is constrained by fixed context lengths, limiting their ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms. Our goal is to provide a thorough analysis of current approaches that facilitate context length extrapolation in code, particularly for long code completion tasks.

Paper Structure (25 sections, 12 equations, 1 figure, 3 tables)

Figures (1)

  • Figure 1: Comparison of length extrapolation techniques for code completion, categorized into Positional Encoding Based (e.g., RoPE, ReRoPE) and Efficient Attention Based methods (e.g., StreamingLLM, Paged Attention, Flash Attention). These approaches address the challenges of handling long code sequences in Transformer models.
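To make the efficient-attention category in the caption more concrete, the sketch below illustrates the cache-retention idea commonly associated with StreamingLLM: keep a few initial "sink" tokens plus a sliding window of the most recent tokens, and number positions by their slot in the cache rather than by their offset in the original text. This is an illustrative sketch and not the paper's implementation; the function names and the default values of `n_sink` and `window` are assumptions chosen for the example.

```python
from typing import List


def retained_positions(seq_len: int, n_sink: int = 4, window: int = 8) -> List[int]:
    """Return the original token indices whose key/value entries stay cached.

    Hypothetical helper: keeps the first `n_sink` tokens (attention sinks)
    and the last `window` tokens, and drops everything in between.
    """
    if seq_len <= n_sink + window:
        # Nothing needs to be evicted yet.
        return list(range(seq_len))
    sinks = list(range(n_sink))
    recent = list(range(seq_len - window, seq_len))
    return sinks + recent


def cache_positions(kept: List[int]) -> List[int]:
    """Assign positions by slot inside the cache (0..len-1).

    Assumption for this sketch: positions are re-indexed within the cache,
    so the model never sees a position beyond n_sink + window - 1.
    """
    return list(range(len(kept)))


if __name__ == "__main__":
    kept = retained_positions(seq_len=20, n_sink=2, window=3)
    print("kept token indices:", kept)
    print("positions fed to the model:", cache_positions(kept))
```

With this policy the cache size, and therefore the largest position index the model sees, stays bounded no matter how long the code file grows. The trade-off is that tokens in the evicted middle region are no longer attended to.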