Exfiltration of personal information from ChatGPT via prompt injection

Gregory Schwartzman

TL;DR

Prompt injection lets an attacker exfiltrate users' personal information from ChatGPT 4 and 4o through malicious prompts that instruct the model to access URLs and to leak data via its memory feature. The paper presents two attacks: a naive one that leaks data through the URL query, and a more robust one that transmits data through chains of URL prefixes, including a memory-enabled variant that persists across sessions. It analyzes the existing defense mechanisms, demonstrates ways to bypass them, outlines mitigations such as restricting URL access, memory management, and user education, and discusses responsible disclosure. The findings underscore the privacy risks of consumer LLMs and suggest practical safeguards for limiting data leakage in real-world deployments.

Abstract

We report that ChatGPT 4 and 4o are susceptible to a prompt injection attack that allows an attacker to exfiltrate users' personal data. The attack requires no third-party tools, and all users are currently affected. The vulnerability is exacerbated by the recent introduction of ChatGPT's memory feature, which allows an attacker to command ChatGPT to monitor the user for the desired personal data.
