Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Abhimanyu Hans; Yuxin Wen; Neel Jain; John Kirchenbauer; Hamid Kazemi; Prajwal Singhania; Siddharth Singh; Gowthami Somepalli; Jonas Geiping; Abhinav Bhatele; Tom Goldstein

TL;DR

This work runs extensive experiments training billion-scale Llama-2 models, both pre-trained and trained from scratch, and demonstrates significant reductions in extractable memorization with little to no impact on downstream benchmarks.

Abstract

Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, randomly sampled subsets of tokens are excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale Llama-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks.
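The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each token is dropped independently with probability 1/k, where k is a hypothetical drop-frequency parameter, and that the loss is averaged over the remaining tokens. The function names `goldfish_mask`, `token_nll`, and `goldfish_loss` are illustrative only.

```python
import math
import random


def goldfish_mask(num_tokens, k, seed=0):
    """Return 0/1 flags per token; 0 marks a token excluded from the loss.

    Assumption (not taken from the paper text above): each token is
    dropped independently with probability 1/k.
    """
    rng = random.Random(seed)
    return [0 if rng.random() < 1.0 / k else 1 for _ in range(num_tokens)]


def token_nll(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def goldfish_loss(logits_per_position, targets, mask):
    """Average next-token loss over the positions where mask == 1."""
    kept = sum(mask)
    if kept == 0:
        return 0.0  # nothing is supervised in this sequence
    total = sum(
        flag * token_nll(logits, target)
        for logits, target, flag in zip(logits_per_position, targets, mask)
    )
    return total / kept


# Toy usage: three positions over a vocabulary of three tokens.
logits = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0]]
targets = [0, 1, 2]
loss = goldfish_loss(logits, targets, [1, 0, 1])  # middle token is dropped
```

Because a dropped position contributes nothing to the loss, the model receives no training signal for predicting that token, which is the mechanism the abstract credits with breaking verbatim reproduction of long token chains.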

Paper Structure (31 sections, 2 equations, 10 figures, 2 tables)


Introduction
Related Work
Quantifying Memorization in LLMs
Mitigating Memorization in LLMs
Regularization and Memorization
Goldfish Loss: Learning Without Memorizing
Robust Handling of Duplicate Passages with Hashing
Can Goldfish Loss Prevent Memorization?
Preventing Memorization in Extreme Scenarios
Preventing Memorization in Standard Training
Divergence Positions vs. Drop Positions
Can LLMs Swallow the Goldfish Loss? Testing Impacts on Model Performance.
Impact on Evaluation Benchmark Performance
Impact on Language Modeling Ability
Validation Loss Curves.
...and 16 more sections

Figures (10)

Figure 1: A pretrained 7B model (the control) is further trained for 100 epochs on (left) the first chapter of Harry Potter or (right) 100 Wikipedia documents. We observe a drop in exact match memorization and RougeL metrics when training with goldfish loss (see Section \ref{sec:extractable-memorization} for metric descriptions). When prompted with the opening of Harry Potter (gray), the standard model regenerates the original text (red) while the goldfish model does not.
Figure 2: Memorization as a Function of $k$ in Goldfish Loss: We train 1B parameter models as described in Section \ref{sec:memorization-experiment-setup} and plot histograms of RougeL scores to measure extractable memorization. Control refers to a model not trained on the 2000 repeated Wikipedia documents. We observe that for lower values of $k$, the extractable memorization is close to the control, and that the exact repetitions observed with standard loss are effectively mitigated.
Figure 3: Benchmark Performance: We pretrain 1B parameter models on 20 billion tokens as described in Section \ref{sec:memorization-experiment-setup} and evaluate downstream performance on various benchmarks. We note only a marginal change in performance for models trained with goldfish loss ($k=3$ and $k=4$) in comparison to the model trained with standard loss. Control refers to the model trained only on RedPajama and not on the Wikipedia canaries.
Figure 4: Number of dropped tokens and number of divergent tokens at each sequence position for a goldfish model with $k=4$.
Figure 5: Validation Loss Curves During Pretraining: We measure validation loss on the RedPajamaV2 dataset as training progresses. Left: We observe validation loss as a function of input tokens seen during training. The 4-GL model trails behind the standard loss model for the same number of input tokens. Right: However, when matching the standard loss by the count of supervised tokens (i.e., the number of unmasked tokens), either by increasing the number of steps or by expanding the batch size, we observe a similar final validation loss.
...and 5 more figures

Theorems & Definitions (1)

Remark
