Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model Queries

Yu-Hsiang Huang; Yuche Tsai; Hsiang Hsiao; Hong-Yi Lin; Shou-De Lin

TL;DR

The paper analyzes the privacy risks of text embeddings when attackers can neither access nor query the embedding model. It proposes a transferable embedding inversion attack: a surrogate encoder is trained on a leaked dataset, using consistency regularization and adversarial training to mimic the victim encoder, and text is then reconstructed through the surrogate. Experiments on the OpenAI Ada, SBERT, and ST5 embedding models and on the clinical MIMIC-III dataset show that the transfer attack substantially outperforms direct inversion and recovers sensitive attributes with high accuracy. The work highlights significant privacy risks in embedding pipelines and vector databases and underscores the need for robust defenses and data governance in embedding-based systems.

Abstract

This study investigates the privacy risks associated with text embeddings, focusing on the scenario where attackers cannot access the original embedding model. Contrary to previous research requiring direct model access, we explore a more realistic threat model by developing a transfer attack method. This approach uses a surrogate model to mimic the victim model's behavior, allowing the attacker to infer sensitive information from text embeddings without direct access. Our experiments across various embedding models and a clinical dataset demonstrate that our transfer attack significantly outperforms traditional methods, revealing the potential privacy vulnerabilities in embedding technologies and emphasizing the need for enhanced security measures.
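
The surrogate idea described above can be illustrated with a toy sketch. This is not the authors' implementation. It assumes the leaked dataset consists of (text, embedding) pairs, replaces both the victim and the surrogate with linear maps over bag-of-words counts, and leaves out the consistency regularization and adversarial training named in the TL;DR. All names (VOCAB, DIM, train_surrogate, victim_weights, and so on) are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random

# Hypothetical toy vocabulary and embedding size (assumptions, not from the paper).
VOCAB = ["patient", "fever", "cough", "aspirin", "insulin", "diabetes", "flu", "rest"]
DIM = 4


def bag_of_words(text):
    """Count how often each vocabulary word appears in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def encode(weights, text):
    """Linear toy encoder: embedding = weights (DIM x |VOCAB|) times word counts."""
    features = bag_of_words(text)
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train_surrogate(leaked_pairs, epochs=200, lr=0.1, seed=0):
    """Fit surrogate weights by SGD on squared error against leaked embeddings."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in VOCAB] for _ in range(DIM)]
    for _ in range(epochs):
        for text, target in leaked_pairs:
            features = bag_of_words(text)
            for i in range(DIM):
                predicted = sum(w * f for w, f in zip(weights[i], features))
                error = predicted - target[i]
                for j, f in enumerate(features):
                    weights[i][j] -= lr * error * f
    return weights


# Toy stand-in for the victim encoder; the attacker never sees these weights.
victim_rng = random.Random(42)
victim_weights = [[victim_rng.uniform(-1, 1) for _ in VOCAB] for _ in range(DIM)]

# Leaked dataset: (text, embedding) pairs that already exist before the attack.
# The attacker only reads these pairs and never queries the victim.
leaked_texts = [" ".join(pair) for pair in itertools.combinations(VOCAB, 2)]
leaked_pairs = [(text, encode(victim_weights, text)) for text in leaked_texts]

surrogate_weights = train_surrogate(leaked_pairs)

# Evaluation only: compare surrogate and victim embeddings on an unseen text.
held_out = "patient fever cough aspirin"
similarity = cosine(encode(surrogate_weights, held_out),
                    encode(victim_weights, held_out))
print(f"cosine similarity on held-out text: {similarity:.4f}")
```

In this toy setting the surrogate can match the victim closely because both are linear and the leaked pairs cover every vocabulary word. Real encoders are nonlinear, so the paper's additional training objectives address a much harder version of the same alignment problem.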

Paper Structure

This paper contains 30 sections, 6 equations, 4 figures, and 9 tables.

Introduction
Preliminary
Embedding inversion attack
Transferable embedding inversion attack
Methodology
Encoder Stealing with a Surrogate Model
Adversarial Threat Model Transferability
Training Pipeline
Experiment Setup
Attack Result
In-domain Text Reconstruction
Out-of-domain Text Reconstruction
Discussion
Ablation Study
Size of the Leaked Dataset
...and 15 more sections

Figures (4)

Figure 1: Model architecture of the transferable embedding inversion attack.
Figure 2: Comparison of attack performance on the QNLI dataset with respect to the size of the leaked dataset $D_L$.
Figure 3: Stealing rate of the surrogate model compared to the oracle model as the size of $D_L$ varies.
Figure 4: Attack performance across different victim and surrogate encoders. The embedding similarity metric is used to measure attack performance.
