Undesirable Memorization in Large Language Models: A Survey

Ali Satvaty, Suzan Verberne, Fatih Turkmen

TL;DR

This survey addresses the privacy and security risks of memorization in large language models, proposing a three-dimensional taxonomy across granularity, retrievability, and desirability. It surveys measurement approaches (string matching, exposure, inference attacks, counterfactuality, and prompt compression), identifies drivers (model size, data characteristics, prompts, and decoding), and reviews mitigation strategies (data deduplication, differential privacy, unlearning, and heuristics). The authors lay out a research agenda that highlights future work on balancing privacy with performance, distinguishing memorization from understanding, and examining memorization in conversational agents, RAG systems, multilingual models, and diffusion language models. The work aims to guide researchers and practitioners in auditing, defending, and responsibly deploying LLMs with respect to memorization risks.

Abstract

While recent research increasingly showcases the remarkable capabilities of Large Language Models (LLMs), it is equally crucial to examine their associated risks. Among these, privacy and security vulnerabilities are particularly concerning, posing significant ethical and legal challenges. At the heart of these vulnerabilities stands memorization, which refers to a model's tendency to store and reproduce phrases from its training data. This phenomenon has been shown to be a fundamental source of various privacy and security attacks against LLMs. In this paper, we provide a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability. Next, we discuss the metrics and methods used to quantify memorization, followed by an analysis of the causes and factors that contribute to the memorization phenomenon. We then explore strategies that have been used so far to mitigate the undesirable aspects of this phenomenon. We conclude our survey by identifying potential research topics for the near future, including methods to balance privacy and performance, and the analysis of memorization in specific LLM contexts such as conversational agents, retrieval-augmented generation, and diffusion language models. Given the rapid research pace in this field, we also maintain a dedicated repository of the references discussed in this survey, which will be regularly updated to reflect the latest developments.

Paper Structure

This paper contains 52 sections, 1 equation, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Our data selection process: starting with 14 foundational papers as the seed, our literature search expanded through keyword search in Google Scholar (223 papers) and arXiv (111 papers), followed by manual filtering, reference chaining, and exploration of adjacent areas, culminating in a final selection of 99 papers.
  • Figure 2: Graphic summary of our survey
  • Figure 3: The spectrum of memorization could be viewed as a 3-dimensional cube.

Theorems & Definitions (10)

  • Definition
  • Definition
  • Definition
  • Definition
  • Definition
  • Definition
  • Definition
  • Definition
  • Remark
  • Definition: Exposure