Look Twice before You Leap: A Rational Agent Framework for Localized Adversarial Anonymization
Donghang Duan, Xu Zheng, Yuefeng He, Chong Mu, Leyi Cai, Lizong Zhang
TL;DR
The paper tackles the privacy-utility trade-off in text anonymization for LLMs by highlighting the privacy risks of API-based approaches and the utility collapse observed when migrating to local small-scale models. It introduces Rational Localized Adversarial Anonymization (RLAA), a training-free Attacker-Arbitrator-Anonymizer framework that enforces rational decision-making and early stopping to prevent destructive edits. Using Marginal Privacy Gain, Marginal Utility Cost, and Marginal Rate of Substitution, RLAA imposes an implicit budget to govern anonymization steps, with the arbitrator filtering low-value feedback. Experiments on PersonalReddit and reddit-self-disclosure across Llama3-8B and Qwen2.5-7B demonstrate superior privacy-utility trade-offs and Pareto dominance over strong baselines, supported by ablations and economic analyses. The work provides a practical, fully localized path to privacy-preserving NLP, while noting latency and the absence of formal DP guarantees as areas for future work.
Abstract
Current LLM-based text anonymization frameworks usually rely on remote API services from powerful LLMs, which creates an inherent privacy paradox: users must disclose their data to untrusted third parties in order to have its privacy protected. Moreover, directly migrating current solutions to local small-scale models (LSMs) yields a suboptimal result with severe utility collapse. Our work argues that this failure stems not merely from the capability deficits of LSMs, but significantly from the inherent irrationality of the greedy adversarial strategies employed by current state-of-the-art (SOTA) methods. To address this, we propose Rational Localized Adversarial Anonymization (RLAA), a fully localized and training-free framework featuring an Attacker-Arbitrator-Anonymizer architecture. We model the anonymization process as a trade-off between Marginal Privacy Gain (MPG) and Marginal Utility Cost (MUC), and demonstrate that greedy strategies tend to drift into an irrational state. Instead, RLAA introduces an arbitrator that acts as a rationality gatekeeper, validating the attacker's inference to filter out feedback providing negligible privacy benefits. This mechanism promotes a rational early-stopping criterion and structurally prevents utility collapse. Extensive experiments on different benchmarks demonstrate that RLAA achieves a superior privacy-utility trade-off compared to strong baselines.
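The gatekeeping idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give formulas for MPG or MUC, so the sketch assumes MPG is the drop in an attacker confidence score, MUC is the drop in a utility score, and the Marginal Rate of Substitution is their ratio. The names `CandidateEdit`, `arbitrate`, `should_stop` and the thresholds `min_gain` and `min_mrs` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateEdit:
    """One proposed anonymization edit with scores in [0, 1] (all assumed inputs)."""
    description: str
    privacy_before: float  # attacker confidence before the edit (higher = more leakage)
    privacy_after: float   # attacker confidence after the edit
    utility_before: float  # utility score before the edit
    utility_after: float   # utility score after the edit


def marginal_privacy_gain(edit: CandidateEdit) -> float:
    # Assumption: privacy gain is the reduction in attacker confidence.
    return edit.privacy_before - edit.privacy_after


def marginal_utility_cost(edit: CandidateEdit) -> float:
    # Assumption: utility cost is the reduction in the utility score.
    return edit.utility_before - edit.utility_after


def marginal_rate_of_substitution(edit: CandidateEdit, eps: float = 1e-9) -> float:
    # Privacy gained per unit of utility lost; eps guards against a zero or negative cost.
    return marginal_privacy_gain(edit) / max(marginal_utility_cost(edit), eps)


def arbitrate(edits: List[CandidateEdit],
              min_gain: float = 0.05,
              min_mrs: float = 1.0) -> List[CandidateEdit]:
    """Keep only edits whose privacy benefit is non-negligible and worth its utility cost."""
    accepted = []
    for edit in edits:
        if marginal_privacy_gain(edit) < min_gain:
            continue  # negligible privacy benefit: filter the feedback out
        if marginal_rate_of_substitution(edit) < min_mrs:
            continue  # costs more utility than the privacy it buys
        accepted.append(edit)
    return accepted


def should_stop(accepted: List[CandidateEdit]) -> bool:
    # Early stopping: no remaining edit passes the rationality check.
    return len(accepted) == 0


edits = [
    CandidateEdit("generalize city name", 0.9, 0.4, 0.95, 0.90),
    CandidateEdit("rephrase a neutral sentence", 0.4, 0.39, 0.90, 0.85),
    CandidateEdit("delete a whole paragraph", 0.4, 0.3, 0.90, 0.50),
]
accepted = arbitrate(edits)
print([e.description for e in accepted])
print(should_stop(accepted))
```

Under these assumed thresholds, only the first edit passes: the second yields too little privacy gain, and the third removes more utility than the privacy it buys. A greedy strategy, by contrast, would apply all three.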
