Look Twice before You Leap: A Rational Agent Framework for Localized Adversarial Anonymization

Donghang Duan, Xu Zheng, Yuefeng He, Chong Mu, Leyi Cai, Lizong Zhang

TL;DR

The paper tackles the privacy-utility trade-off in text anonymization for LLMs by highlighting the privacy risks of API-based approaches and the utility collapse observed when migrating to local small-scale models. It introduces Rational Localized Adversarial Anonymization (RLAA), a training-free Attacker-Arbitrator-Anonymizer framework that enforces rational decision-making and early stopping to prevent destructive edits. Using Marginal Privacy Gain, Marginal Utility Cost, and Marginal Rate of Substitution, RLAA imposes an implicit budget to govern anonymization steps, with the arbitrator filtering low-value feedback. Experiments on PersonalReddit and reddit-self-disclosure across Llama3-8B and Qwen2.5-7B demonstrate superior privacy-utility trade-offs and Pareto dominance over strong baselines, supported by ablations and economic analyses. The work provides a practical, fully localized path to privacy-preserving NLP, while noting latency and the absence of formal DP guarantees as areas for future work.

Abstract

Current LLM-based text anonymization frameworks usually rely on remote API services from powerful LLMs, which creates an inherent privacy paradox: users must disclose data to untrusted third parties for guaranteed privacy preservation. Moreover, directly migrating current solutions to local small-scale models (LSMs) offers a suboptimal solution with severe utility collapse. Our work argues that this failure stems not merely from the capability deficits of LSMs, but significantly from the inherent irrationality of the greedy adversarial strategies employed by current state-of-the-art (SOTA) methods. To address this, we propose Rational Localized Adversarial Anonymization (RLAA), a fully localized and training-free framework featuring an Attacker-Arbitrator-Anonymizer architecture. We model the anonymization process as a trade-off between Marginal Privacy Gain (MPG) and Marginal Utility Cost (MUC), and demonstrate that greedy strategies tend to drift into an irrational state. Instead, RLAA introduces an arbitrator that acts as a rationality gatekeeper, validating the attacker's inference to filter out feedback providing negligible privacy benefits. This mechanism promotes a rational early-stopping criterion, and structurally prevents utility collapse. Extensive experiments on different benchmarks demonstrate that RLAA achieves a superior privacy-utility trade-off compared to strong baselines.
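
The control flow described in the abstract (attacker proposes inferences, arbitrator validates them, anonymizer edits only on validated leaks, and the loop stops early when nothing survives validation) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `rlaa_loop`, the three callables, the `max_rounds` parameter, and the toy string-matching stand-ins for the local models are all hypothetical.

```python
from typing import Callable, List, Tuple

def rlaa_loop(
    text: str,
    attacker: Callable[[str], List[str]],
    arbitrator: Callable[[str, str], bool],
    anonymizer: Callable[[str, List[str]], str],
    max_rounds: int = 5,  # assumed cap on iterations, not taken from the paper
) -> Tuple[str, int]:
    """Hypothetical attacker-arbitrator-anonymizer loop with early stopping."""
    for round_idx in range(max_rounds):
        # 1. The attacker guesses private attributes from the current text.
        inferences = attacker(text)
        # 2. The arbitrator keeps only inferences it can validate against the text.
        validated = [inf for inf in inferences if arbitrator(text, inf)]
        # 3. Early stop: no validated leak means further edits only cost utility.
        if not validated:
            return text, round_idx
        # 4. The anonymizer edits the text for the validated leaks only.
        text = anonymizer(text, validated)
    return text, max_rounds

# Toy stand-ins for the three roles (assumptions for illustration only).
SENSITIVE = ["Berlin", "nurse"]

def toy_attacker(text: str) -> List[str]:
    # Always appends one ungrounded guess to mimic a spurious inference.
    return [w for w in SENSITIVE if w in text] + ["age:30"]

def toy_arbitrator(text: str, inference: str) -> bool:
    # Accept an inference only if it is literally supported by the text.
    return inference in text

def toy_anonymizer(text: str, leaks: List[str]) -> str:
    for leak in leaks:
        text = text.replace(leak, "[REDACTED]")
    return text

result, rounds = rlaa_loop(
    "I work as a nurse in Berlin.", toy_attacker, toy_arbitrator, toy_anonymizer
)
print(result, rounds)
```

In this toy run the ungrounded guess `age:30` is filtered out by the arbitrator, so the loop stops after a single editing round instead of continuing to rewrite the text.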

Paper Structure

This paper contains 25 sections, 3 theorems, 19 equations, 9 figures, 6 tables, 1 algorithm.

Key Result

Proposition 1

The arbitrator's discrete mechanism creates a binary decision boundary that is functionally equivalent to an economic agent operating with an implicit budget $\lambda$.
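
One way to read Proposition 1 is that an accept/reject verdict behaves like a threshold on the cost of an edit relative to its benefit. The sketch below illustrates that reading only. It assumes MRS is the ratio of Marginal Utility Cost to Marginal Privacy Gain, which this summary does not state, and the function names, the `eps` guard, and the numeric values are hypothetical.

```python
def marginal_rate_of_substitution(mpg: float, muc: float, eps: float = 1e-9) -> float:
    """Assumed form: utility cost paid per unit of privacy gained."""
    return muc / max(mpg, eps)

def arbitrator_accepts(mpg: float, muc: float, budget: float) -> bool:
    """Binary verdict: accept an edit only if it yields real privacy gain
    and its cost stays within the implicit budget (MUC <= budget * MPG)."""
    return mpg > 0 and muc <= budget * mpg

# Illustrative numbers only; none are taken from the paper.
print(arbitrator_accepts(mpg=0.4, muc=0.1, budget=0.5))   # cheap, useful edit
print(arbitrator_accepts(mpg=0.05, muc=0.3, budget=0.5))  # costly, low-gain edit
print(arbitrator_accepts(mpg=0.0, muc=0.1, budget=0.5))   # no privacy gain at all
```

Under these assumptions, rejecting every edit whose MRS exceeds the budget gives the same decisions as the binary gate, which is the sense in which a discrete verdict can stand in for an explicit budget $\lambda$.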

Figures (9)

  • Figure 1: Utility Collapse of FgAA’s Naive Migration.
  • Figure 2: The RLAA Framework. Utilizing an Attacker-Arbitrator-Anonymizer architecture, the arbitrator acts as a rationality gatekeeper. It validates attacker inferences to filter out ghost leaks with negligible privacy benefits, structurally preventing utility collapse caused by irrational greedy strategies.
  • Figure 3: Privacy-Utility Trade-off. RLAA achieves superior trade-offs compared to FgAA across iterations on two datasets. The trade-off dynamics for structural metrics (ROUGE-L/BLEU) are detailed in the appendix on detailed results.
  • Figure 4: Cumulative MRS Analysis of Llama3-8B. The figure displays Llama3-8B's cumulative MRS during the anonymization process on two datasets. FgAA (Red) shows a sustained increase, whereas RLAA (Blue) maintains a stable low MRS. The remaining results for DeepSeek-V3.2-Exp and Qwen2.5-7B are provided in the appendix on detailed results.
  • Figure 5: Privacy-utility trade-offs via structural metrics (ROUGE-L and BLEU). Results are shown for PersonalReddit (Left) and reddit-self-disclosure (Right), demonstrating RLAA’s resistance to structural collapse.
  • ...and 4 more figures

Theorems & Definitions (8)

  • Definition 1: Marginal Privacy Gain, MPG
  • Definition 2: Marginal Utility Cost, MUC
  • Definition 3: Marginal Rate of Substitution, MRS
  • Definition 4: Economic Rationality Condition
  • Proposition 1: Implicit Budget Enforcement
  • Proof: Derivation
  • Corollary 1: Hallucination Defense - Instantaneous Rationality
  • Corollary 2: Rational Equilibrium - Asymptotic Rationality