Low Resource Reconstruction Attacks Through Benign Prompts
Sol Yarkoni, Mahmood Sharif, Roi Livni
TL;DR
The paper investigates privacy and copyright risks in diffusion-based models by introducing a low-resource reconstruction attack that relies on benign prompts and domain knowledge rather than access to the training data. It collects a diverse set of prompts drawn from e-commerce and print-on-demand (PoD) templates, searches the generated images for near-duplicates using segmentation-guided masking and CLIP-based clique analysis, and traces the sources to real-world products, revealing template memorization that extends beyond verbatim copies. Key findings include large-scale template memorization (11,400 images) across multiple models, visible copies of real people, and persistent leakage and interpolations even in newer models (e.g., SD 3.5 and Midjourney under constrained prompts). A user study validates that the outputs are perceived as copies. The work underscores the practical privacy risks posed by template-based memorization and argues for rethinking data stewardship and mitigation strategies in generative systems. Code is publicly released for replication.
Abstract
Recent advances in generative models, such as diffusion models, have raised concerns related to privacy, copyright infringement, and data stewardship. To better understand and control these risks, prior work has introduced techniques and attacks that reconstruct images, or parts of images, from training data. While these results demonstrate that training data can be recovered, existing methods often rely on high computational resources, partial access to the training set, or carefully engineered prompts. In this work, we present a new attack that requires low resources, assumes little to no access to the training data, and identifies seemingly benign prompts that can lead to potentially risky image reconstruction. We further show that such reconstructions may occur unintentionally, even for users without specialized knowledge. For example, we observe that for one existing model, the prompt "blue Unisex T-Shirt" generates the face of a real individual. Moreover, by combining the identified vulnerabilities with real-world prompt data, we discover prompts that reproduce memorized visual elements. Our approach builds on insights from prior work and leverages domain knowledge to expose a fundamental vulnerability arising from the use of scraped e-commerce data, where templated layouts and images are closely tied to pattern-like textual prompts. The code for our attack is publicly available at https://github.com/TheSolY/lr-tmi.
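The near-duplicate search mentioned in the TL;DR (clique analysis over image embeddings) can be illustrated with a minimal sketch. This is not the released implementation. It assumes that images generated from one prompt with different seeds have already been embedded by an image encoder such as CLIP, and it omits the segmentation-guided masking step. The function names, the similarity threshold of 0.9, and the minimum clique size of 3 are hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between row vectors."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T


def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    n = len(adjacency)
    neighbors = [
        {int(j) for j in np.flatnonzero(adjacency[i])} - {i} for i in range(n)
    ]
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return cliques


def near_duplicate_cliques(embeddings, threshold=0.9, min_size=3):
    """Groups of images that are all mutually similar above the threshold.

    threshold and min_size are illustrative defaults, not values from the paper.
    """
    similarity = cosine_similarity_matrix(embeddings)
    adjacency = similarity >= threshold
    return [c for c in maximal_cliques(adjacency) if len(c) >= min_size]


if __name__ == "__main__":
    # Toy embeddings: the first three vectors are near-duplicates,
    # the last two are unrelated to them and to each other.
    toy = [
        [1.0, 0.0, 0.0],
        [0.99, 0.05, 0.0],
        [0.98, 0.0, 0.05],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    print(near_duplicate_cliques(toy))
```

A large clique of mutually similar outputs across seeds suggests that the prompt reproduces one fixed visual template instead of sampling diverse images, which is the kind of signal a memorization search can then inspect further. Requiring a clique, and not merely a connected component, avoids chaining together images that are only transitively similar.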
