LLM Foundation Models: February 2026 Week 6
Feb 5 – Feb 11, 2026 · 238 papers analyzed · 3 breakthroughs
Summary
238 LLM papers analyzed. 3 breakthroughs: (1) 2602.05896 proves $O(\sqrt{n})$ average sensitivity bound for 1-layer transformers, ruling out PARITY, and constructs 4-layer PARITY-capable transformer; (2) 2602.09276 introduces intrinsic dimensionality as predictor of CoT generalization ($\rho=0.93$), showing effective reasoning compresses task representation; (3) 2602.08100 reveals emergent search and backtracking in latent reasoning models with 34% accuracy gain from backtracking. Trends: transformer expressivity getting formal treatment, latent reasoning dynamics now observable, intrinsic dimensionality emerges as principled metric for reasoning strategy selection.
Key Takeaway
Week 6 bridges theory and practice: fundamental expressivity limits via sensitivity, intrinsic dimensionality as MDL-inspired metric for CoT quality, and first systematic observation of search dynamics in latent reasoning.
Breakthroughs (3)
1. Parity, Sensitivity, and Transformers
Why Novel: First theoretical characterization of transformer expressivity via sensitivity analysis. Proves fundamental lower bound on 1-layer architectures and provides constructive 4-layer PARITY solution with standard positional encoding.
Key Innovations:
- Proves an $O(\sqrt{n})$ average sensitivity bound for 1-layer 1-head transformers via a hyperplane-edge cutting argument
- Corollary: no 1-layer 1-head transformer can compute PARITY (which requires linear sensitivity)
- Constructs 4-layer transformer computing PARITY with length-independent, polynomially bounded positional encoding
- Works for both full attention and causally-masked attention models
- Uses quantifier elimination in the lower bound proof
Evidence:
- Lower bound theorem: a 1-layer 1-head transformer has $O(\sqrt{n})$ average sensitivity
- Corollary: no 1-layer 1-head transformer computes PARITY
- Constructive theorem: 4-layer transformer for PARITY with standard-form embedding
- Key lemma: quantifier-free formula characterization of 1-layer output
Impact: Establishes formal complexity separation between shallow and deep transformers. Provides principled guidance for architecture design when tasks require high-sensitivity computation.
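The separation above can be made concrete with a small experiment. The sketch below (my illustration, not the paper's construction) computes average sensitivity by brute force: PARITY has average sensitivity exactly $n$, since every single-bit flip changes the output, while a threshold function like MAJORITY sits far below; a 1-layer 1-head transformer capped at $O(\sqrt{n})$ average sensitivity therefore cannot compute PARITY.

```python
from itertools import product

def avg_sensitivity(f, n):
    """Average, over all 2^n inputs, of the number of single-bit
    flips that change f's output."""
    total = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip bit i
            if f(tuple(y)) != f(x):
                total += 1
    return total / 2 ** n

parity = lambda x: sum(x) % 2                  # flips under every bit change
majority = lambda x: int(sum(x) > len(x) / 2)  # only boundary bits matter

n = 8
print(avg_sensitivity(parity, n))    # exactly n -> 8.0
print(avg_sensitivity(majority, n))  # much lower than n
```

The brute-force loop is exponential in $n$, so this is only a demonstration at small sizes, but it makes the sensitivity gap behind the lower bound tangible.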
2. Effective Reasoning Chains Reduce Intrinsic Dimensionality
Why Novel: First use of intrinsic dimensionality as a quantitative lens on CoT effectiveness. Shows reasoning strategy quality predicts generalization via task compression, with Spearman $\rho=0.93$ on the 4B model.
Key Innovations:
- Intrinsic dimensionality (minimum trainable parameters for threshold accuracy) strongly predicts ID and OOD generalization
- Executed PoT achieves lowest intrinsic dim (1.49M) and highest OOD performance (43.4%)
- Distractor tokens increase intrinsic dim and degrade performance monotonically
- Spearman $\rho=0.93$ (4B) between intrinsic dim and overall accuracy; the correlation also holds on the 1B model
- MDL-inspired explanation: effective reasoning compresses task into lower-dimensional representation
Evidence:
- Gemma-3 4B: intrinsic dim correlates with overall accuracy across 15 strategies ($\rho=0.93$)
- Gemma-3 1B: intrinsic dim correlates with accuracy, outperforming KL and length metrics
- Overview figure showing the inverse correlation between intrinsic dimensionality and generalization
- Pareto frontier visualization of training accuracy vs trainable parameters
Impact: Provides principled, model-agnostic metric for evaluating and selecting reasoning strategies. Explains why some CoT variants generalize better via compression theory.
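The reported metric is a rank correlation between per-strategy intrinsic dimensionality and accuracy. A minimal sketch of that measurement, using made-up numbers (only the paper's headline values 1.49M and 43.4% are real; the other points are hypothetical), with Spearman's $\rho$ implemented directly for self-containment:

```python
def spearman(xs, ys):
    """Spearman rank correlation via the classic 1 - 6*sum(d^2)/(n(n^2-1))
    formula (assumes no ties, for clarity)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical per-strategy measurements (NOT the paper's full data):
# intrinsic dimensionality (millions of params) vs OOD accuracy (%).
intrinsic_dim = [1.49, 2.1, 2.8, 3.5, 4.2]   # executed PoT is lowest
ood_accuracy  = [43.4, 39.0, 35.2, 30.1, 28.7]

rho = spearman(intrinsic_dim, ood_accuracy)
print(rho)  # -1.0 on this toy data: rankings are perfectly inverse
```

In the paper's setting the correlation is reported between intrinsic dim and accuracy directly; the toy data above just shows the inverse-ranking pattern the result describes.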
3. Emergent Search and Backtracking in Latent Reasoning Models
Why Novel: First systematic characterization of belief dynamics in latent reasoning transformers. Reveals structured search with exploration, commitment, and directed backtracking phases.
Key Innovations:
- Latent reasoning exhibits exploration phase followed by shallow commitment and potential backtracking
- Backtracking occurs in 32% of cases, improves accuracy by 34%
- Backtracking is directed: abandoned answer is most similar distractor in 72% of cases
- Task difficulty modulates deliberation: easy questions converge fast, no-correct-answer maintains high entropy
- Entropy trajectory distinguishes task regimes without access to ground truth
Evidence:
- Representative trajectory: exploration, commitment to the closest distractor (fish), backtrack to the correct answer (echinoderm)
- Task difficulty modulates deliberation: easy questions converge fast, base questions explore, no-correct-answer questions stay uncertain
- Entropy tracks task difficulty across recurrence steps
- Backtracking is directed: 72% abandon the most similar distractor; 52% end at the correct answer
Impact: Demonstrates latent computation can mirror CoT's corrective capabilities while enabling direct interpretability. Opens path to understanding and improving implicit reasoning.
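The phases described above can be read off a belief trajectory. The sketch below is a hypothetical detection heuristic, not the paper's method: given per-step probability distributions over answer options (assumed extracted from the latent recurrence), it finds "commitments" (runs where the argmax is stable) and flags a backtrack when a later commitment differs from the first; entropy over the same trajectory serves as the difficulty signal.

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)

def detect_backtrack(trajectory, commit_len=2):
    """trajectory: list of per-step distributions over answer options.
    Returns (commitments, backtracked): a 'commitment' is a run of at
    least commit_len steps with the same argmax; backtracked is True if
    the final commitment differs from the first one."""
    argmaxes = [max(range(len(p)), key=p.__getitem__) for p in trajectory]
    commitments = []
    run_start = 0
    for t in range(1, len(argmaxes) + 1):
        if t == len(argmaxes) or argmaxes[t] != argmaxes[run_start]:
            if t - run_start >= commit_len:
                commitments.append(argmaxes[run_start])
            run_start = t
    backtracked = len(commitments) >= 2 and commitments[0] != commitments[-1]
    return commitments, backtracked

# Toy trajectory over 3 options: explore, commit to distractor 1,
# then backtrack to the correct option 0.
traj = [
    [0.34, 0.33, 0.33],  # exploration: near-uniform, high entropy
    [0.20, 0.55, 0.25],  # shallow commitment to option 1
    [0.15, 0.60, 0.25],
    [0.45, 0.35, 0.20],  # directed backtrack toward option 0
    [0.70, 0.20, 0.10],
    [0.80, 0.15, 0.05],  # converged, low entropy
]
commits, backtracked = detect_backtrack(traj)
print(commits, backtracked)  # [1, 0] True
```

Entropy falls from near-uniform at the exploration step to a low value after convergence, matching the finding that the entropy trajectory separates task regimes without ground truth.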
Trends
Transformer expressivity getting formal treatment: PARITY bounds via sensitivity, hyperplane arguments
Latent reasoning dynamics now observable: belief trajectories, backtracking, entropy as difficulty signal
Intrinsic dimensionality emerges as principled metric for reasoning strategy selection
Resource rationality discovered as emergent property of inference-time scaling
Efficiency decomposition frameworks separating logic robustness from verbosity
Notable Papers (5)
1. Reasoning aligns language models to human cognition
Active probabilistic reasoning task separates evidence sampling from inference. Extended reasoning boosts inference by reducing biases and places LLMs in a shared cognitive space with humans.
2. Dynamics Within Latent Chain-of-Thought: An Empirical Study of Causal Structure
Intervention-based framework quantifies how latent states influence predictions. Reveals heterogeneous stepwise leverage and non-local information flow in Coconut and CODI.
3. Are More Tokens Rational? Inference-Time Scaling as Adaptive Resource Rationality
Variable Attribution Task shows resource rationality emerges from extended inference without explicit cost rewards. Identifies task-adaptive strategy shifts from permutation to elimination.
4. Decomposing Reasoning Efficiency in Large Language Models
Trace-optional framework decomposes efficiency into robustness, logic, and verbosity. Shows efficiency rankings diverge from accuracy, with verbalization overhead varying up to 9x.
5. Free(): Learning to Forget in Malloc-Only Reasoning Models
Identifies thinking token paradox where excessive tokens degrade performance. Proposes learnable forgetting mechanism to address reasoning model overthinking.
Honorable Mentions
- InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
- PACE: Defying the Scaling Hypothesis of Exploration in Iterative Alignment for Mathematical Reasoning
- Inference-Time Rethinking with Latent Thought Vectors for Math Reasoning
- Does Your Reasoning Model Implicitly Know When to Stop Thinking?
- Difficulty-Estimated Policy Optimization