Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications

Till Speicher; Mohammad Aflah Khan; Qinyuan Wu; Vedant Nanda; Soumi Das; Bishwamittra Ghosh; Krishna P. Gummadi; Evimaria Terzi

TL;DR

This work introduces an experimental framework based on repeatedly exposing large language models to random strings. It identifies factors that make some strings easier to memorise than others and characterises the role of local prefixes and global context in memorisation.

Abstract

Understanding whether and to what extent large language models (LLMs) have memorised training data has important implications for the reliability of their output and the privacy of their training data. To cleanly measure and disentangle memorisation from other phenomena (e.g. in-context learning), we create an experimental framework that is based on repeatedly exposing LLMs to random strings. Our framework allows us to better understand the dynamics, i.e., how the model behaves when it is repeatedly exposed to random strings. Using our framework, we make several striking observations: (a) we find consistent phases of the dynamics across families of models (Pythia, Phi and Llama2), (b) we identify factors that make some strings easier to memorise than others, and (c) we identify the role of local prefixes and global context in memorisation. We also show that sequential exposure to different random strings has a significant effect on memorisation. Our results, often surprising, have significant downstream implications in the study and usage of LLMs.
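To make the setup concrete, the following is a minimal sketch of how a random string over an alphabet $A$ of size $\ell$ might be sampled and how recollection accuracy could be scored. It is not the authors' code: uniform sampling, the function names, and the definition of accuracy as the fraction of positions predicted correctly are all assumptions made for illustration.

```python
import random


def sample_random_string(alphabet, n, seed=None):
    """Draw n tokens independently and uniformly from `alphabet`.

    Assumption: uniform i.i.d. sampling; the paper may sample differently.
    """
    rng = random.Random(seed)
    return [rng.choice(alphabet) for _ in range(n)]


def uniform_guess_accuracy(alphabet):
    """Expected accuracy of guessing uniformly at random from the alphabet."""
    return 1.0 / len(alphabet)


def recollection_accuracy(predicted, target):
    """Fraction of positions where the prediction matches the target.

    Hypothetical definition used here for illustration only.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if len(target) == 0:
        raise ValueError("target must be non-empty")
    matches = sum(p == t for p, t in zip(predicted, target))
    return matches / len(target)


if __name__ == "__main__":
    alphabet = list("abcd")  # an alphabet with l = 4 symbols
    target = sample_random_string(alphabet, n=1024, seed=0)
    guesses = sample_random_string(alphabet, n=1024, seed=1)
    print("baseline:", uniform_guess_accuracy(alphabet))
    print("accuracy of independent guesses:", recollection_accuracy(guesses, target))
```

Under these assumptions, a model that has only learned which tokens belong to $A$ would score close to the uniform baseline, while a model that has memorised the string would approach an accuracy of $1$.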

Paper Structure (33 sections, 4 equations, 47 figures, 3 tables)

Introduction
Preliminaries and experimental setup
The dynamics of repeated exposure to random strings
Q1: Are some strings easier to memorise than others?
Q2: What information do models need to recall memorised tokens?
Q3: How do models behave when sequentially memorising random strings?
Conclusions and Limitations
Additional details on the experimental setup
Technical details on the training setup
Examples of random strings used in the paper
Computational resources
Additional results for the memorisation dynamics
Additional models and metrics
Results for non-Latin alphabets
Results for untrained models
...and 18 more sections

Figures (47)

Figure 1: [Recollection accuracy for different alphabet sizes $\ell$ and models ${\mathcal{M}}$. ($n = 1024$)] For all models, the accuracy initially increases quickly before stagnating at the random guess level during the Guessing-Phase. Afterwards, the accuracy converges more slowly towards $1$ during the Memorisation-Phase. The accuracy of randomly guessing tokens from $A$ is shown with dashed lines.
Figure 2: [Aggregate probability mass and entropy for different $\ell$. ($n = 1024$)] i) Plots on the top show the probability mass that ${\mathcal{M}}$ assigns to tokens in $A$. In all cases, models quickly learn to allocate the maximum possible probability mass to the tokens within the alphabet $A$, i.e. they only predict tokens from $A$ after a few training epochs. ii) We show the average entropy of the probability distribution of model ${\mathcal{M}}$ over $A$. The entropy initially rises to its maximum value, before decreasing to 0. The maximum attainable entropy (for different $\ell$) is shown with dashed lines.
Figure 3: [Recollection accuracy for different entropy levels $h$. ($n = 1024$)] Analogously to strings with different $\ell$, strings with lower $h$ are easier to guess, but harder to memorise. Dashed lines indicate the performance of a random guess, equivalent to always guessing "a".
Figure 4: [Recollection accuracy for different prefix lengths and for changes in the global context (GC) during training. ($n = 1024, {\mathcal{M}} = \text{Pythia-1B}$)] (a) and (b) show what fraction of tokens can be recollected correctly with different prefix lengths, at different points during training. In many cases, prefixes much shorter than the full string are sufficient to predict most of the tokens accurately. (c) shows the performance of a randomly re-sampled vs a constant global context with only one repeated token, and (d) shows the impact of changing the size of the global context, where the numbers indicate multiples of the GC size.
Figure 5: [Accuracy on different strings during sequential memorisation. ($n = 1024, {\mathcal{M}} = \text{Pythia-1B}$)] Each curve denotes a new string. As the model memorises new strings, it forgets old ones, shown by the drop in accuracy after the first 50 epochs per string.
...and 42 more figures
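The quantities in the Figure 2 caption can be illustrated with a small sketch. It assumes that the model's next-token distribution is available as a list of probabilities indexed by token id, that the entropy over $A$ is computed after renormalising the probabilities of the alphabet tokens, and that entropy is measured in bits. None of these choices is confirmed by the text above, and the function names are hypothetical.

```python
import math


def alphabet_mass(probs, alphabet_ids):
    """Total probability assigned to the tokens in the alphabet A."""
    return sum(probs[i] for i in alphabet_ids)


def alphabet_entropy(probs, alphabet_ids):
    """Entropy (in bits) of the distribution restricted to A.

    Assumption: probabilities are renormalised over A before computing entropy.
    """
    mass = alphabet_mass(probs, alphabet_ids)
    if mass <= 0:
        raise ValueError("no probability mass on the alphabet")
    entropy = 0.0
    for i in alphabet_ids:
        p = probs[i] / mass
        if p > 0:  # treat 0 * log(0) as 0
            entropy -= p * math.log2(p)
    return entropy


def max_entropy(alphabet_size):
    """Largest attainable entropy over an alphabet: the uniform distribution."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    # Toy vocabulary of 6 tokens; the alphabet A is tokens 0..3 (l = 4).
    alphabet_ids = [0, 1, 2, 3]
    guessing = [0.25, 0.25, 0.25, 0.25, 0.0, 0.0]   # uniform over A
    memorised = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]      # certain about one token
    for name, probs in [("guessing", guessing), ("memorised", memorised)]:
        print(name, alphabet_mass(probs, alphabet_ids),
              alphabet_entropy(probs, alphabet_ids), max_entropy(len(alphabet_ids)))
```

In this toy example both distributions place all of their mass on $A$, but the uniform one sits at the maximum entropy $\log_2 \ell$ while the one-hot one has entropy 0, mirroring the rise and subsequent fall of entropy described in the caption.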
