Reason to Rote: Rethinking Memorization in Reasoning

Yupei Du, Philipp Mondorf, Silvia Casola, Yuekun Yao, Robert Litschko, Barbara Plank

TL;DR

The paper addresses how language models can memorize noisy training labels while maintaining generalizable reasoning. It employs two controlled synthetic tasks, Four-Digit Addition and Two-Hop Relational Reasoning, to dissect the interaction between memorization and generalization, revealing that memorization relies on existing reasoning mechanisms and distributed encodings rather than simple look-up. Key findings include a two-phase learning dynamic (generalize then memorize), strong overlap between generalization and memorization circuits, and a neuron-level mechanism in FDA termed outlier heuristics. The results illuminate benign memorization and show how reasoning components adapt to accommodate noisy data, with implications for understanding implicit regularization and designing robust models. Overall, memorization does not override reasoning but subtly reshapes it through distributed, architecture-aligned adaptations.

Abstract

Large language models readily memorize arbitrary training instances, such as label noise, yet they perform strikingly well on reasoning tasks. In this work, we investigate how language models memorize label noise, and why such memorization in many cases does not heavily affect generalizable reasoning capabilities. Using two controllable synthetic reasoning datasets with noisy labels, four-digit addition (FDA) and two-hop relational reasoning (THR), we discover a reliance of memorization on generalizable reasoning mechanisms: models continue to compute intermediate reasoning outputs even when retrieving memorized noisy labels, and intervening on reasoning adversely affects memorization. We further show that memorization operates through distributed encoding, i.e., aggregating various inputs and intermediate results, rather than building a look-up mechanism from inputs to noisy labels. Moreover, our FDA case study reveals that memorization occurs via outlier heuristics, where existing neuron activation patterns are slightly shifted to fit noisy labels. Together, our findings suggest that memorization of label noise in language models builds on, rather than overrides, the underlying reasoning mechanisms, shedding light on the intriguing phenomenon of benign memorization.

Paper Structure

This paper contains 46 sections, 2 equations, 14 figures.

Figures (14)

  • Figure 1: Task data composition for FDA and THR. Both tasks are designed to have a small portion of noisy labels, to clearly separate generalization (clean validation set) from memorization (noisy training set).
  • Figure 2: Graph used to synthesize profiles for THR.
  • Figure 3: Generalization and memorization mechanisms co-exist on noisy training instances.
  • Figure 4: Memorization of noisy labels relies on generalizable reasoning mechanisms.
  • Figure 5: Memories follow distributed encodings across different input tokens and intermediate results.
  • ...and 9 more figures
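
The data composition described for Figure 1 (a mostly clean training set with a small portion of noisy labels) can be illustrated for FDA with a minimal sketch. This is not the authors' data pipeline: the function name `make_fda_dataset`, the `noise_rate` parameter, the operand range 1000 to 9999, and the way a wrong label is drawn are all assumptions made for illustration only.

```python
import random


def make_fda_dataset(n_examples, noise_rate, seed=0):
    """Build a toy four-digit addition dataset with label noise.

    Hypothetical sketch: operand range, noise_rate, and the noise
    sampling scheme are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_examples):
        # Assumption: both operands are four-digit numbers without leading zeros.
        a = rng.randint(1000, 9999)
        b = rng.randint(1000, 9999)
        true_sum = a + b
        label = true_sum
        is_noisy = rng.random() < noise_rate
        if is_noisy:
            # Replace the label with a random wrong sum in the valid output range.
            while label == true_sum:
                label = rng.randint(2000, 19998)
        data.append(
            {"a": a, "b": b, "label": label, "true_sum": true_sum, "is_noisy": is_noisy}
        )
    return data


train = make_fda_dataset(n_examples=1000, noise_rate=0.05, seed=0)
noisy = [ex for ex in train if ex["is_noisy"]]
clean = [ex for ex in train if not ex["is_noisy"]]
```

In this setup, accuracy on the `noisy` subset measured against `label` would reflect memorization, while accuracy on held-out clean examples would reflect generalization, mirroring the separation the figure caption describes.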