Text Embedding Inversion Security for Multilingual Language Models
Yiyi Chen, Heather Lent, Johannes Bjerva
TL;DR
This work analyzes embedding inversion security for multilingual language models in a black-box setting, extending prior English-focused studies to examine cross-lingual and multilingual vulnerabilities. It defines multilingual and cross-lingual inversion attacks, builds a Vec2Text-style attacker framework using ME5-base and MTG data, and evaluates performance across English and four additional languages. The study reveals that multilingual models can be more vulnerable under certain conditions and that defenses designed for monolingual English models often fail in multilingual contexts; it also introduces a simple masking defense that preserves retrieval while substantially reducing reconstruction. The findings underscore the importance of multilingual security research, and the work provides open-source tools to spur further defenses and evaluations across diverse languages.
Abstract
Textual data is often represented as real-valued embeddings in NLP, particularly with the popularity of large language models (LLMs) and Embeddings as a Service (EaaS). However, sensitive information stored as embeddings can be susceptible to security breaches, as research shows that text can be reconstructed from embeddings, even without knowledge of the underlying model. While defense mechanisms have been explored, these focus exclusively on English, leaving other languages potentially exposed to attacks. This work explores LLM security through multilingual embedding inversion. We define the problem of black-box multilingual and cross-lingual inversion attacks, and explore their potential implications. Our findings suggest that multilingual LLMs may be more vulnerable to inversion attacks, in part because English-based defenses may be ineffective. To alleviate this, we propose a simple masking defense effective for both monolingual and multilingual models. This study is the first to investigate multilingual inversion attacks, shedding light on the differences in attacks and defenses across monolingual and multilingual settings.
