A Comparative Analysis of LLM Memorization at Statistical and Internal Levels: Cross-Model Commonalities and Model-Specific Signatures

Bowen Chen, Namgi Han, Yusuke Miyao

Abstract

Memorization is a fundamental component of intelligence for both humans and LLMs. However, while LLM performance scales rapidly, our understanding of memorization lags behind. Because access to LLM pre-training data is limited, most previous studies focus on a single model series. The resulting observations are isolated from one another, leaving it unclear which findings are general and which are model-specific. In this study, we collect multiple model series (Pythia, OpenLLaMa, StarCoder, OLMo1/2/3) and analyze their shared and unique memorization behavior at both the statistical and internal levels, connecting individual observations and reporting new findings. At the statistical level, we show that the memorization rate scales log-linearly with model size and that memorized sequences can be further compressed. Further analysis reveals a shared frequency and domain distribution pattern for memorized sequences. However, individual models also show distinct features within these shared patterns. At the internal level, we find that LLMs can remove certain injected perturbations, while memorized sequences are more sensitive to them. By decoding middle layers and ablating attention heads, we reveal the general decoding process and the shared important heads for memorization. However, the distribution of those important heads differs between model families, showing a unique family-level feature. By bridging various experiments and revealing new findings, this study paves the way for a universal and fundamental understanding of memorization in LLMs.

Paper Structure

This paper contains 34 sections, 4 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: (Top) The relationship between memorization rate, model size, and pre-training corpus size. The x- and y-axes show parameter count and memorization rate, both on a log scale. Line width and color encode the training token count (thicker lines = more tokens). (Bottom left) The compression ratio across models. A smaller compression ratio means fewer tokens are needed to generate the same memorized sequence. (Bottom right) The probability density distribution of memorization scores for all models (log-scaled y-axis).
  • Figure 2: The average token frequency distribution for different models. (Top) The normalized frequency for memorized sequences; dotted vertical lines indicate the average frequency for each model. (Bottom) The absolute frequency for memorized (solid line) and unmemorized (dotted line) sequences.
  • Figure 3: (Left) Memorization score distribution shift under noise. (Right) Attention head similarity between clean and noised models, with variance shown as the shaded region. Solid/dotted lines represent results for memorized/unmemorized sequences. The noise levels are 0.1, 0.2, 0.3, 0.4, and 0.5.
  • Figure 4: Decoding probability (y-axis) for memorized and unmemorized tokens across layers (x-axis) for all models. Solid/dotted lines represent the probability for memorized/unmemorized sequences.
  • Figure 5: Distribution of shared important heads across layers for Pythia-12b and OLMo-13b.
  • ...and 8 more figures