Table of Contents
Fetching ...

Evaluating LLM-based Personal Information Extraction and Countermeasures

Yupei Liu, Yuqi Jia, Jinyuan Jia, Neil Zhenqiang Gong

TL;DR

This work measures the capability of large language models to extract personal information (PIE) from public profiles and develops a benchmarking framework for attacks and defenses. It formalizes an attack framework with configurable prompt design and profile processing, and evaluates across four datasets (one synthetic, three real-world) and ten LLMs, revealing that LLM-based PIE often surpasses traditional methods. The study further introduces prompt injection as a defense that, when invisibly embedded in profiles, substantially degrades attacker extraction performance across information categories, though adaptive strategies show limited defense in some cases. Together, the findings highlight a significant security risk from LLM-based PIE and offer practical defense insights and open-science resources for ongoing research.

Abstract

Automatically extracting personal information -- such as name, phone number, and email address -- from publicly available profiles at a large scale is a stepstone to many other security attacks including spear phishing. Traditional methods -- such as regular expression, keyword search, and entity detection -- achieve limited success at such personal information extraction. In this work, we perform a systematic measurement study to benchmark large language model (LLM) based personal information extraction and countermeasures. Towards this goal, we present a framework for LLM-based extraction attacks; collect four datasets including a synthetic dataset generated by GPT-4 and three real-world datasets with manually labeled eight categories of personal information; introduce a novel mitigation strategy based on prompt injection; and systematically benchmark LLM-based attacks and countermeasures using ten LLMs and five datasets. Our key findings include: LLM can be misused by attackers to accurately extract various personal information from personal profiles; LLM outperforms traditional methods; and prompt injection can defend against strong LLM-based attacks, reducing the attack to less effective traditional ones.

Evaluating LLM-based Personal Information Extraction and Countermeasures

TL;DR

This work measures the capability of large language models to extract personal information (PIE) from public profiles and develops a benchmarking framework for attacks and defenses. It formalizes an attack framework with configurable prompt design and profile processing, and evaluates across four datasets (one synthetic, three real-world) and ten LLMs, revealing that LLM-based PIE often surpasses traditional methods. The study further introduces prompt injection as a defense that, when invisibly embedded in profiles, substantially degrades attacker extraction performance across information categories, though adaptive strategies show limited defense in some cases. Together, the findings highlight a significant security risk from LLM-based PIE and offer practical defense insights and open-science resources for ongoing research.

Abstract

Automatically extracting personal information -- such as name, phone number, and email address -- from publicly available profiles at a large scale is a stepstone to many other security attacks including spear phishing. Traditional methods -- such as regular expression, keyword search, and entity detection -- achieve limited success at such personal information extraction. In this work, we perform a systematic measurement study to benchmark large language model (LLM) based personal information extraction and countermeasures. Towards this goal, we present a framework for LLM-based extraction attacks; collect four datasets including a synthetic dataset generated by GPT-4 and three real-world datasets with manually labeled eight categories of personal information; introduce a novel mitigation strategy based on prompt injection; and systematically benchmark LLM-based attacks and countermeasures using ten LLMs and five datasets. Our key findings include: LLM can be misused by attackers to accurately extract various personal information from personal profiles; LLM outperforms traditional methods; and prompt injection can defend against strong LLM-based attacks, reducing the attack to less effective traditional ones.
Paper Structure (31 sections, 2 equations, 7 figures, 22 tables)

This paper contains 31 sections, 2 equations, 7 figures, 22 tables.

Figures (7)

  • Figure 1: LLM-based personal information extraction.
  • Figure 2: Impact of the number of in-context learning examples on LLM-based PIE.
  • Figure 3: Impact of the personal profile complexity (measured by the number of tokens) on LLM-based PIE.
  • Figure 4: Impact of different prompts to generate personal profiles in the synthetic dataset.
  • Figure 5: An example profile from the synthetic dataset after rendering. The left one has no injected prompt and the right one contains an injected prompt.
  • ...and 2 more figures