Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs

Mingyu Jin; Yutong Yin; Jingcheng Niu; Qingcheng Zeng; Wujiang Xu; Mengnan Du; Wei Cheng; Zhaoran Wang; Tianlong Chen; Dimitris N. Metaxas

TL;DR

This study provides new mechanistic insights into how LLMs internalize OOD challenges, and designs Sparsity-Guided Curriculum In-Context Learning (SG-ICL), a strategy that explicitly uses representation sparsity to schedule few-shot demonstrations, leading to considerable performance enhancements.

Abstract

In this work, we investigate how Large Language Models (LLMs) adapt their internal representations when encountering inputs of increasing difficulty, quantified as the degree of out-of-distribution (OOD) shift. We reveal a consistent and quantifiable phenomenon: as task difficulty increases, whether through harder reasoning questions, longer contexts, or additional answer choices, the last hidden states of LLMs become substantially sparser. In short, the farther the shift, the sparser the representations. This sparsity-difficulty relation is observable across diverse models and domains, suggesting that language models respond to unfamiliar or complex inputs by concentrating computation into specialized subspaces in the last hidden state. Through a series of controlled analyses with a learning-dynamics explanation, we demonstrate that this sparsity is not incidental but an adaptive mechanism for stabilizing reasoning under OOD shift. Leveraging this insight, we design Sparsity-Guided Curriculum In-Context Learning (SG-ICL), a strategy that explicitly uses representation sparsity to schedule few-shot demonstrations, leading to considerable performance enhancements. Our study provides new mechanistic insights into how LLMs internalize OOD challenges. The source code is available at https://github.com/MingyuJ666/sparsityLLM.
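The idea of scheduling few-shot demonstrations by representation sparsity can be illustrated with a minimal sketch. This is not the authors' SG-ICL implementation: the function names `schedule_demonstrations` and `build_prompt`, the easy-to-hard ordering, and the example scores are all assumptions made for illustration. It assumes each candidate demonstration already has a scalar sparsity score, with a higher score meaning a sparser (harder) example.

```python
from typing import List, Sequence


def schedule_demonstrations(
    demos: Sequence[str],
    sparsity: Sequence[float],
    easy_first: bool = True,
) -> List[str]:
    """Order few-shot demonstrations by a precomputed sparsity score.

    Assumption (not from the paper): a higher score means a sparser
    last hidden state, which is treated here as a harder example.
    """
    if len(demos) != len(sparsity):
        raise ValueError("demos and sparsity must have the same length")
    # Sort indices by score; ascending order gives an easy-to-hard curriculum.
    order = sorted(
        range(len(demos)),
        key=lambda i: sparsity[i],
        reverse=not easy_first,
    )
    return [demos[i] for i in order]


def build_prompt(demos: Sequence[str], query: str) -> str:
    """Concatenate ordered demonstrations and the query into one prompt."""
    return "\n\n".join(list(demos) + [query])


# Hypothetical demonstrations and made-up sparsity scores.
demos = ["Q: 2+2? A: 4", "Q: 17*23? A: 391", "Q: 9-4? A: 5"]
scores = [0.31, 0.72, 0.28]

ordered = schedule_demonstrations(demos, scores)
prompt = build_prompt(ordered, "Q: 12*12? A:")
```

Here `ordered` places the lowest-scoring demonstration first and the highest-scoring one last. Whether the paper's curriculum runs easy-to-hard, and how it selects demonstrations, should be checked against the released code.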

Paper Structure (62 sections, 11 theorems, 65 equations, 15 figures, 4 tables)

Introduction
❶ Sparsity increases with difficulty in a robust and controlled manner.
❷ Learning dynamics connect density to familiarity.
❸ Sparsity-guided curricula improve reasoning.
RQ1: How does the geometry of the last hidden state evolve as reasoning tasks become increasingly difficult?
Sparsity vs Reasoning Complexity
Sparsity vs Answer Choice Expansion
Sparsity vs Knowledge Conflict
Sparsity vs Long Context Reasoning
RQ2: What mechanisms drive the emergence of sparsity when models face OOD challenges?
Controlled Environment
Preliminaries.
❶ Graph Construction.
❷ Sequence Serialization.
❸ Controlled Data Splitting.
...and 47 more sections

Key Result

Lemma 3.2

For all $t$ for which the trajectories are differentiable,

Figures (15)

Figure 1: Harder Inputs Induce Sparser Representations. Across all four controlled difficulty axes, the last hidden states become progressively sparser as tasks get harder. Results are shown for Qwen2.5-3B using Top-10% Energy; the same trend holds across difficulty settings, sparsity metrics, and LLM sizes.
Figure 2: Overview of Sparsity Analysis. Together, the two subfigures paint a consistent picture: the left subfigure shows that difficulty increases sparsity, while the right subfigure shows that sparsity tracks accuracy degradation.
Figure 3: Sparsity Metrics under Answer Choice Expansion. Bar plots show mean sparsity across 14 disciplines for five metrics under Normal (+0), Moderate Expansion (+5), and Large Expansion (+10) on Qwen2.5-3B. Error bars indicate the minimum and maximum across disciplines. Increasing task difficulty leads to higher sparsity.
Figure 4: Sparsity Differences under Knowledge Conflict. We measure the last hidden state sparsity for two conditions (non-conflict and conflict) across five metrics for Qwen2.5-3B. All results are statistically significant. Arrows denote how each metric relates to sparsity ($\uparrow$: higher is sparser; $\downarrow$: lower is sparser). Again, the harder conflict condition is consistently sparser than the non-conflict condition across all metrics.
Figure 5: Layer-wise Sparsity across Context Lengths. While intermediate layers show minimal variation across contexts, the final layers exhibit sharp divergence: longer contexts consistently produce sparser representations. This experiment uses LongReasonQA [li2025longcontext], which allows the background context length to be controlled.
...and 10 more figures
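The captions above name Top-10% Energy as one of the sparsity metrics but do not define it on this page. A minimal sketch under an assumed definition: the share of a vector's squared-magnitude energy carried by its largest 10% of coordinates. The function name `top_fraction_energy` and the toy vectors are hypothetical, and the paper's exact formula may differ.

```python
import math
from typing import Sequence


def top_fraction_energy(h: Sequence[float], fraction: float = 0.10) -> float:
    """Share of squared-magnitude energy in the largest `fraction` of coordinates.

    Assumed definition (not confirmed by the source): a larger value means
    energy is concentrated in fewer coordinates, i.e. a sparser vector.
    """
    if len(h) == 0:
        raise ValueError("h must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Per-coordinate energy, largest first.
    energies = sorted((x * x for x in h), reverse=True)
    total = sum(energies)
    if total == 0.0:
        return 0.0  # all-zero vector: no energy to concentrate
    # Keep at least one coordinate so the metric is defined for short vectors.
    k = max(1, math.ceil(fraction * len(energies)))
    return sum(energies[:k]) / total


# Toy 10-dimensional "hidden states" (illustrative values only).
dense = [1.0] * 10             # energy spread evenly across coordinates
peaked = [10.0] + [0.1] * 9    # energy concentrated in one coordinate

dense_score = top_fraction_energy(dense)
peaked_score = top_fraction_energy(peaked)
```

Under this assumed definition the evenly spread vector scores exactly the chosen fraction, while the peaked vector scores close to 1. In practice the input would be the last hidden state of a model at the final token position.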

Theorems & Definitions (14)

Remark 3.1: A simple setting where the induced $h$-dynamics equation (label eq:appC_induced_h_dynamics) holds exactly
Lemma 3.2: Exact drift identity
Lemma 3.3: Two-sided decay bound
Lemma 3.5: Uniform bound on $|D_\varepsilon|$ on Phase I
Lemma 3.6: Phase I dynamic
Corollary 3.7: Phase I decrease trend + certified hitting time
Remark 3.8: What Phase I does and does not claim
Lemma 3.10: Top-2 dominance from runner-up separation
Lemma 3.11: Diagonal negativity on $S$
Lemma 3.12: Uniform negativity of $D_\varepsilon$ on the Phase II window (with easy-complement control)
...and 4 more
