1-Diffractor: Efficient and Utility-Preserving Text Obfuscation Leveraging Word-Level Metric Differential Privacy
Stephen Meisenbacher, Maulik Chevli, Florian Matthes
TL;DR
This work tackles privacy-preserving NLP by introducing 1-Diffractor, a fast word-level Metric Local Differential Privacy (MLDP) mechanism that operates in a one-dimensional embedding space. By converting embeddings to index-based lists and applying geometric noise, it achieves strong utility with substantially improved efficiency compared to prior methods, while providing formal $\varepsilon d_{\mathcal{V}}$-privacy guarantees. The authors validate utility on GLUE, privacy through plausible deniability and adversarial tests, and efficiency through speed and memory benchmarks, highlighting the trade-off between privacy strength and task performance. Overall, 1-Diffractor offers a scalable, lightweight approach to private text obfuscation that is practical for real-world NLP pipelines.
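For reference, the $\varepsilon d_{\mathcal{V}}$-privacy guarantee named above can be read through the standard definition of metric differential privacy. The symbols $M$ (the mechanism), $w, w'$ (input words from the vocabulary $\mathcal{V}$), and $z$ (an output word) are notation assumed here for illustration and may differ from the paper's own:

$$\Pr[M(w) = z] \le e^{\varepsilon\, d_{\mathcal{V}}(w, w')} \cdot \Pr[M(w') = z] \quad \text{for all } w, w', z \in \mathcal{V}$$

Under this reading, words that are close under the distance $d_{\mathcal{V}}$ are nearly indistinguishable from the output, while distant words are protected less strongly; a smaller $\varepsilon$ tightens the bound for every pair.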
Abstract
The study of privacy-preserving Natural Language Processing (NLP) has gained increasing attention in recent years. One promising avenue studies the integration of Differential Privacy in NLP, which has brought about innovative methods in a variety of application settings. Of particular note are $\textit{word-level Metric Local Differential Privacy (MLDP)}$ mechanisms, which work to obfuscate potentially sensitive input text by performing word-by-word $\textit{perturbations}$. Although these methods have shown promising results in empirical tests, there are two major drawbacks: (1) the inevitable loss of utility due to the addition of noise, and (2) the computational expense of running these mechanisms on high-dimensional word embeddings. In this work, we aim to address these challenges by proposing $\texttt{1-Diffractor}$, a new mechanism that achieves substantial speedups over previous mechanisms, while still demonstrating strong utility- and privacy-preserving capabilities. We evaluate $\texttt{1-Diffractor}$ for utility on several NLP tasks, for theoretical and task-based privacy, and for efficiency in terms of speed and memory. $\texttt{1-Diffractor}$ shows significant improvements in efficiency, while still maintaining competitive utility and privacy scores across all conducted comparative tests against previous MLDP mechanisms. Our code is made available at: https://github.com/sjmeis/Diffractor.
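The core idea summarized above (order words along a one-dimensional embedding, then perturb each word's list index with geometric noise) can be illustrated with a minimal sketch. This is not the authors' implementation; see the linked repository for that. The toy vocabulary, the function names, the clamping of out-of-range indices, and the choice to leave out-of-vocabulary words unchanged are all assumptions made for illustration.

```python
import math
import random


def build_word_list(embeddings_1d):
    """Order the vocabulary by scalar embedding value (index-based list)."""
    return sorted(embeddings_1d, key=embeddings_1d.get)


def sample_geometric(alpha, rng):
    """Draw from a one-sided geometric distribution on {0, 1, 2, ...}."""
    u = 1.0 - rng.random()  # u lies in (0, 1], which avoids log(0)
    return int(math.floor(math.log(u) / math.log(alpha)))


def two_sided_geometric(epsilon, rng):
    """Two-sided geometric noise: P(k) is proportional to exp(-epsilon * |k|)."""
    alpha = math.exp(-epsilon)
    # The difference of two i.i.d. geometric draws is two-sided geometric.
    return sample_geometric(alpha, rng) - sample_geometric(alpha, rng)


def perturb_word(word, word_list, index_of, epsilon, rng):
    """Replace a word with the word found at its noisy list index."""
    if word not in index_of:
        return word  # assumption: out-of-vocabulary words pass through
    noisy = index_of[word] + two_sided_geometric(epsilon, rng)
    noisy = max(0, min(len(word_list) - 1, noisy))  # assumption: clamp to list
    return word_list[noisy]


def obfuscate(text, word_list, epsilon, rng):
    """Perturb a whitespace-tokenized text word by word."""
    index_of = {w: i for i, w in enumerate(word_list)}
    return " ".join(
        perturb_word(w, word_list, index_of, epsilon, rng) for w in text.split()
    )


# Hypothetical 1-D embedding values, invented for this example.
toy_embeddings = {
    "apple": -0.40, "pear": -0.35, "cat": 0.12,
    "dog": 0.15, "car": 0.80, "truck": 0.85,
}
word_list = build_word_list(toy_embeddings)
rng = random.Random(0)
private_text = obfuscate("cat dog truck", word_list, epsilon=0.5, rng=rng)
```

Each word costs one dictionary lookup and two random draws, with no nearest-neighbor search in a high-dimensional space, which gives a sense of where the efficiency gain comes from. A smaller `epsilon` produces larger index shifts and therefore stronger obfuscation at the cost of utility.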
