Recursive Language Models

Alex L. Zhang, Tim Kraska, Omar Khattab

TL;DR

The paper addresses the bottleneck of fixed context windows in large language models by introducing Recursive Language Models (RLMs), which treat prompts as external environment state and enable the root model to recursively query itself via a persistent REPL. This approach dramatically extends effective prompt length (to 10M+ tokens) and yields strong performance gains on diverse long-context tasks, with costs comparable to or lower than baselines. The authors provide extensive empirical evaluation across multiple benchmarks and frontier models, and they analyze emergent RLM trajectories such as code-based filtering and line-by-line sub-LM transformations. The work suggests a new direction for scaling long-context reasoning and motivates future research into training models to operate as RLMs with asynchronous execution and deeper recursion.

Abstract

We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference strategy that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt. We find that RLMs successfully handle inputs up to two orders of magnitude beyond model context windows and, even for shorter prompts, dramatically outperform the quality of base LLMs and common long-context scaffolds across four diverse long-context tasks, while having comparable (or cheaper) cost per query.
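The recursive strategy in the abstract can be sketched as a short loop: the long prompt lives as ordinary program state, and the model recurses over snippets small enough to fit in context, then stitches the sub-answers together. The sketch below is illustrative only and makes simplifying assumptions: `call_lm` is a hypothetical stub standing in for a real LLM API, and the prompt is split by fixed line counts, whereas the actual RLM lets the model write its own decomposition code inside a persistent Python REPL.

```python
def call_lm(query: str, context: str) -> str:
    """Hypothetical stand-in for an LLM API call: answers `query`
    against a short `context`. Stubbed here as a substring search
    so the control flow runs end to end."""
    for line in context.splitlines():
        if query in line:
            return line
    return ""

def recursive_lm(query: str, prompt: str, max_lines: int = 64) -> str:
    """Treat `prompt` as external environment state: if it is too long
    for one (sub-)model call, decompose it and recurse over each piece."""
    lines = prompt.splitlines()
    if len(lines) <= max_lines:
        # Base case: the snippet fits in the stub's "context window".
        return call_lm(query, prompt)
    answers = [
        recursive_lm(query, "\n".join(lines[i : i + max_lines]))
        for i in range(0, len(lines), max_lines)
    ]
    # Root call stitches non-empty sub-answers into a final response.
    return call_lm(query, "\n".join(a for a in answers if a))

# A "long" prompt far beyond the stub's 64-line budget:
long_prompt = "\n".join(f"record {i}: value {i * 7}" for i in range(500))
print(recursive_lm("record 321", long_prompt))  # → record 321: value 2247
```

The key design point, per the abstract, is that the prompt never has to fit in any single model call: only bounded snippets and stitched sub-answers ever enter a context window.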

Paper Structure

This paper contains 22 sections, 10 figures, and 1 table.

Figures (10)

  • Figure 1: A comparison of GPT-5 and a corresponding RLM on three long-context tasks of increasing complexity: S-NIAH, OOLONG, and OOLONG-Pairs. For each task, we scale the input length from $2^{13}$ to $2^{18}$. GPT-5 performance degrades significantly as a function of both input length and task complexity, while the RLM maintains strong performance. Inputs beyond the red region do not fit in GPT-5's context window of 272K tokens, but the RLM handles them effectively. Additional experiments across other models, methods, and benchmarks are in §4.
  • Figure 2: A Recursive Language Model (RLM) treats prompts as part of the environment. It loads the input prompt as a variable inside a Python REPL environment $\mathcal{E}$ and writes code to peek into, decompose, and invoke itself recursively over programmatic snippets of the variable.
  • Figure 3: Cost of RLM and baselines described in §4.2, plotted at the 25th, 50th, 75th, and 95th percentiles of total API cost. We observe comparable or even lower costs for RLMs at the 50th percentile, but sharp increases at the tail end due to potentially long RLM trajectories.
  • Figure 4: RLMs have common patterns in their trajectories when solving tasks. (a) We frequently observed RLMs filtering and interacting with their context through code such as regex queries. (b) We found that RLMs can effectively decompose their context through recursive sub-calls. (c) On long-output tasks, RLMs are able to solve sub-problems using recursive sub-LM calls and stitch their outputs to form a final output.
  • Figure 5: Runtime of GPT-5 across OOLONG, OOLONG-Pairs, CodeQA, and BrowseComp+ (1K) for all methods described in §4.2, plotted at the 25th, 50th, 75th, and 95th percentiles.
  • ...and 5 more figures
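The code-based filtering pattern from Figure 4(a) can be illustrated with a small, hypothetical sketch: inside the REPL, the model narrows a large prompt variable with a regex query before spending any sub-LM calls on it. All variable names and the toy log data below are illustrative assumptions, not taken from the paper.

```python
import re

# Toy stand-in for a long prompt loaded into the REPL as a variable:
# 28 log lines, a few of which contain the signal we care about.
prompt = "\n".join(
    f"2024-01-{d:02d} INFO heartbeat ok" if d % 7 else f"2024-01-{d:02d} ERROR disk full"
    for d in range(1, 29)
)

# Step 1: peek at the environment variable cheaply, without "reading" it.
print(f"{len(prompt.splitlines())} lines loaded")

# Step 2: filter with code instead of pulling everything into context;
# only the matching lines would then be passed to a recursive sub-LM call.
errors = [ln for ln in prompt.splitlines() if re.search(r"\bERROR\b", ln)]
print(errors)
```

The point of the pattern is economy: deterministic code handles the bulk of the prompt, and the (expensive) recursive LM calls only ever see the filtered residue.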