A Multi-Perspective Analysis of Memorization in Large Language Models
Bowen Chen, Namgi Han, Yusuke Miyao
TL;DR
The paper investigates memorization in large language models (LLMs) from multiple angles to uncover how, when, and why memorized content emerges. It introduces a formal memorization criterion $M(X,Y)$ and a prediction framework using token- and sentence-level metrics, across a range of Pythia models from 70M to 12B parameters. Key findings include non-linear memorization scaling with model size and context, boundary effects in input and decoding dynamics, embedding-space clustering indicating paraphrase memorization, and the feasibility of predicting memorization with a Transformer. These results advance understanding of memorization mechanics and have implications for privacy, data contamination, and safer LLM deployment through improved anticipation of memorized content.
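To make the idea of a memorization criterion concrete, here is a minimal sketch of a token-level score. The summary above does not spell out how $M(X,Y)$ is defined, so this is an assumption: $X$ is taken to be the continuation the model generates for a given context, $Y$ is the true continuation from the training data, and the score is the fraction of positions in $Y$ where the two agree. The function name `memorization_score` is hypothetical, and token IDs are plain integers.

```python
from typing import Sequence


def memorization_score(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of reference positions whose token is reproduced exactly.

    Assumed reading of M(X, Y): X = predicted continuation, Y = true
    continuation. This is an illustrative definition, not necessarily the
    paper's exact formula.
    """
    if not reference:
        raise ValueError("reference continuation must be non-empty")
    # zip stops at the shorter sequence, so missing predictions count as misses.
    matches = sum(1 for p, r in zip(predicted, reference) if p == r)
    return matches / len(reference)


# Hypothetical token IDs: three of the four positions match.
score = memorization_score([11, 42, 7, 99], [11, 42, 8, 99])
```

Under this reading, a score of 1.0 corresponds to a fully memorized continuation, 0.0 to an unmemorized one, and values in between to partially memorized content.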
Abstract
Large Language Models (LLMs), which have billions of parameters and are trained on massive corpora, show unprecedented performance in various fields. While impressed by this performance, researchers have also noticed some unusual behaviors in these models. One such behavior is memorization, in which an LLM generates the same content that was used to train it. Although previous research has discussed memorization, it still lacks explanation, especially regarding its causes and the dynamics of generating memorized content. In this research, we comprehensively discuss memorization from various perspectives and extend the scope of the discussion beyond memorized content to less memorized and unmemorized content. Our findings are as follows: (1) Through experiments, we reveal the relation between memorization and model size, continuation size, and context size, and we show how unmemorized sentences transition to memorized sentences. (2) Through embedding analysis, we show the distribution and decoding dynamics across model sizes in embedding space for sentences with different memorization scores. (3) An analysis of n-gram statistics and entropy decoding dynamics reveals a boundary effect when the model starts to generate memorized or unmemorized sentences. (4) We train a Transformer model to predict the memorization of different models, showing that it is possible to predict memorization from context.
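The entropy analysis in finding (3) can be illustrated with a short sketch. It assumes that "entropy" refers to the Shannon entropy of the model's next-token distribution at each decoding step, which the abstract does not state explicitly. The helper names `token_entropy` and `entropy_trajectory` are hypothetical, and the probability vectors are made-up values rather than model outputs.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    # Zero-probability entries contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_trajectory(step_probs: Sequence[Sequence[float]]) -> List[float]:
    """Entropy at each decoding step, given one distribution per step."""
    return [token_entropy(p) for p in step_probs]


# Made-up distributions over a 4-token vocabulary: a confident step
# followed by a maximally uncertain one.
trajectory = entropy_trajectory([
    [0.97, 0.01, 0.01, 0.01],
    [0.25, 0.25, 0.25, 0.25],
])
```

Plotting such a trajectory against token position is one way to look for an abrupt change in entropy at the point where generation switches between memorized and unmemorized text.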
