Undesirable Memorization in Large Language Models: A Survey

Ali Satvaty; Suzan Verberne; Fatih Turkmen

TL;DR

This survey addresses the privacy and security risks of memorization in large language models, proposing a three-dimensional taxonomy across granularity, retrievability, and desirability. It surveys measurement approaches (string matching, exposure, inference attacks, counterfactuality, and prompt compression), identifies drivers (model size, data characteristics, prompts, and decoding), and reviews mitigation strategies (data deduplication, differential privacy, unlearning, and heuristics). The authors lay out a research agenda covering how to balance privacy with performance, how to distinguish memorization from understanding, and how memorization behaves in conversational agents, RAG systems, multilingual models, and diffusion language models. The work aims to guide researchers and practitioners in auditing, defending, and responsibly deploying LLMs with respect to memorization risks.

Abstract

While recent research increasingly showcases the remarkable capabilities of Large Language Models (LLMs), it is equally crucial to examine their associated risks. Among these, privacy and security vulnerabilities are particularly concerning, posing significant ethical and legal challenges. At the heart of these vulnerabilities stands memorization, which refers to a model's tendency to store and reproduce phrases from its training data. This phenomenon has been shown to be a fundamental source of various privacy and security attacks against LLMs. In this paper, we provide a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability. Next, we discuss the metrics and methods used to quantify memorization, followed by an analysis of the causes and factors that contribute to the memorization phenomenon. We then explore strategies that have been used so far to mitigate the undesirable aspects of this phenomenon. We conclude our survey by identifying potential research topics for the near future, including methods to balance privacy and performance, and the analysis of memorization in specific LLM contexts such as conversational agents, retrieval-augmented generation, and diffusion language models. Given the rapid research pace in this field, we also maintain a dedicated repository of the references discussed in this survey, which will be regularly updated to reflect the latest developments.

Paper Structure (52 sections, 1 equation, 3 figures, 3 tables)

Introduction
Related surveys
Data selection
Paper organization
Spectrum of Memorization
Granularity of memorization
Perfect memorization
Verbatim memorization
Approximate memorization
Entity-level memorization
Content memorization
Retrievability
Extractable memorization
Discoverable memorization
Discoverable and extractable memorization
...and 37 more sections

Figures (3)

Figure 1: Our data selection process: Starting with 14 foundational papers as the seed, our literature search expanded through keyword search in Google Scholar (223 papers) and arXiv (111 papers), followed by manual filtering, reference chaining, and exploration of adjacent areas, concluding in a final selection of 99 papers.
Figure 2: Graphic summary of our survey
Figure 3: The spectrum of memorization could be viewed as a 3-dimensional cube.

Theorems & Definitions (10)

Definition
Definition
Definition
Definition
Definition
Definition
Definition
Definition
Remark
Definition: Exposure
