Secret-Protected Evolution for Differentially Private Synthetic Text Generation

Tianze Wang; Zhaoyu Chen; Jian Du; Yingtai Xiao; Linjun Zhang; Qiang Yan

TL;DR

SecPE introduces secret protection as a targeted alternative to uniform differential privacy for privacy-preserving text synthesis. By formalizing $(\mathbf{p},\mathbf{r})$-secret protection and relaxing Gaussian DP to focus on per-secret priors, SecPE achieves tighter utility-privacy trade-offs. The framework uses Secret Clustering to create noisy representatives and Protected Evolution to select high-quality samples, reducing computational complexity from $O(MN_{\mathrm{syn}})$ to $O(KN_{\mathrm{syn}})$ while preserving reconstruction guarantees. Empirically, SecPE shows lower Fréchet Inception Distance and higher downstream task accuracy than GDP-based Aug-PE across OpenReview, PubMed, and Yelp, with less noise required for the same protection level, highlighting the practicality of secret-aware privacy for synthetic text generation.
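The complexity reduction from $O(MN_{\mathrm{syn}})$ to $O(KN_{\mathrm{syn}})$ can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that $M$ private embeddings are compressed into $K$ cluster centers by plain k-means, that Gaussian noise is added to the centers, and that each noisy representative then votes for its nearest synthetic sample. The function names `noisy_representatives` and `vote_histogram` are hypothetical, and `sigma` is a placeholder that is not calibrated to any $(\mathbf{p},\mathbf{r})$-secret protection guarantee.

```python
import numpy as np


def noisy_representatives(private_emb, k, sigma, rng, n_iter=10):
    """Cluster M private embeddings into k centers, then perturb the centers.

    Assumption: plain Lloyd's k-means stands in for the paper's Secret
    Clustering step; sigma is an illustrative noise scale only.
    """
    m, _ = private_emb.shape
    # Initialise centers from k distinct private points.
    centers = private_emb[rng.choice(m, size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Pairwise distances between points and centers, shape (M, k).
        dist = np.linalg.norm(
            private_emb[:, None, :] - centers[None, :, :], axis=2
        )
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = private_emb[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Gaussian perturbation of the k centers (hypothetical calibration).
    return centers + rng.normal(0.0, sigma, size=centers.shape)


def vote_histogram(representatives, synthetic_emb):
    """Each representative votes for its nearest synthetic sample.

    The distance matrix has shape (K, N_syn), so the cost scales with
    K * N_syn rather than M * N_syn.
    """
    dist = np.linalg.norm(
        representatives[:, None, :] - synthetic_emb[None, :, :], axis=2
    )
    nearest = dist.argmin(axis=1)
    return np.bincount(nearest, minlength=len(synthetic_emb))
```

With, say, 200 private embeddings, 30 synthetic candidates, and `k=5`, the voting step computes 5 × 30 distances instead of 200 × 30. The resulting histogram could then be used to select which synthetic samples to keep or vary in the next round.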

Abstract

Text data has become extremely valuable for large language models (LLMs) and may even pave the way toward artificial general intelligence (AGI). However, much of the high-quality text in the real world is private and cannot be freely used due to privacy concerns. Differentially private (DP) synthetic text generation has therefore been proposed, aiming to produce high-utility synthetic data while protecting sensitive information. Existing DP synthetic text generation methods, however, impose uniform guarantees that often overprotect non-sensitive content, resulting in substantial utility loss and computational overhead. To address this, we propose Secret-Protected Evolution (SecPE), a novel framework that extends private evolution with secret-aware protection. Theoretically, we show that SecPE satisfies $(\mathbf{p},\mathbf{r})$-secret protection, a relaxation of Gaussian DP (GDP) that enables tighter utility-privacy trade-offs, while also substantially reducing computational complexity relative to baseline methods. Empirically, across the OpenReview, PubMed, and Yelp benchmarks, SecPE consistently achieves lower Fréchet Inception Distance (FID) and higher downstream task accuracy than GDP-based Aug-PE baselines, while requiring less noise to attain the same level of protection. Our results highlight that secret-aware guarantees can unlock more practical and effective privacy-preserving synthetic text generation.
