LLM-PBE: Assessing Data Privacy in Large Language Models

Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song

TL;DR

This work identifies pervasive data privacy risks in Large Language Models and introduces LLM-PBE, a modular toolkit for systematic privacy assessment across pretraining, fine-tuning, and prompting. It covers four attack families (data extraction, membership inference, prompt leakage, jailbreaking) and four privacy-enhancing technologies, or PETs (scrubbing, differential privacy, machine unlearning, defensive prompting), evaluated over diverse data types and model families. Empirical results show that larger models exhibit stronger memorization and leakage, that data characteristics critically shape risk, and that privacy protections incur utility costs, while some defenses show limited effectiveness against evolving attack methods. The paper provides a practical, extensible platform and actionable insights to guide privacy-preserving practices in LLM deployment and future research.

Abstract

Large Language Models (LLMs) have become integral to numerous domains, significantly advancing applications in data management, mining, and analysis. Their capabilities in processing and interpreting complex language data, however, raise pressing concerns about data privacy, especially the risk of unintentional training data leakage. Despite the critical nature of this issue, no existing literature offers a comprehensive assessment of data privacy risks in LLMs. To address this gap, our paper introduces LLM-PBE, a toolkit designed for the systematic evaluation of data privacy risks in LLMs. LLM-PBE analyzes privacy across the entire LLM lifecycle, incorporating diverse attack and defense strategies and handling various data types and metrics. Through detailed experiments with multiple LLMs, LLM-PBE enables an in-depth exploration of data privacy concerns, shedding light on influential factors such as model size, data characteristics, and evolving temporal dimensions. This study not only enriches the understanding of privacy issues in LLMs but also serves as a resource for future research in the field. The findings, resources, and our full technical report are available at https://llm-pbe.github.io/, providing an open platform for academic and practical advancements in LLM privacy assessment.

Paper Structure (57 sections, 12 figures, 15 tables)

Figures (12)

  • Figure 1: An example of data leakage in LLMs.
  • Figure 2: The design of our toolkit.
  • Figure 3: A demo usage of our toolkit.
  • Figure 4: The model utility (ARC-Easy), data extraction accuracy on Enron, and data extraction accuracy on a synthetic email dataset across different Pythia model sizes.
  • Figure 5: Data extraction attack (DEA) accuracy with different training tokens.
  • ...and 7 more figures