On the Robustness of LDP Protocols for Numerical Attributes under Data Poisoning Attacks
Xiaoguang Li, Zitao Li, Ninghui Li, Wenhai Sun
TL;DR
This work addresses the vulnerability of local differential privacy (LDP) protocols for numerical attributes to data poisoning attacks. It introduces an attack-driven robustness framework and two metrics, Absolute Shift Gain (ASG) and Shift Gain Ratio (SGR), that enable fair robustness comparisons between CFO-based and distribution-reconstruction mechanisms. Through extensive experiments on real and synthetic data, the authors show that CFOs in the Server setting and the Square Wave (SW) distribution-reconstruction method resist manipulation better than CFOs in the User setting, and that hash domain size and post-processing affect security beyond the traditional privacy-utility trade-off. A zero-shot attack detection method, which uses the reconstructed distribution in a KS-test-based hypothesis framework, significantly improves detection over prior work and supports practical defense in hostile environments. The study gives concrete guidance for designing attack-resilient LDP systems and highlights avenues for future work, including robust post-processing, optimal parameter tuning (e.g., hash domain size), and shuffler-enhanced privacy-utility trade-offs.
Abstract
Recent studies reveal that local differential privacy (LDP) protocols are vulnerable to data poisoning attacks, in which an attacker manipulates the final estimate on the server by exploiting the characteristics of LDP and sending carefully crafted data from a small fraction of controlled local clients. This vulnerability raises concerns about the robustness and reliability of LDP in hostile environments. In this paper, we conduct a systematic investigation of the robustness of state-of-the-art LDP protocols for numerical attributes, i.e., categorical frequency oracles (CFOs) with binning and consistency, and distribution reconstruction. We evaluate protocol robustness through an attack-driven approach and propose new metrics for cross-protocol attack gain measurement. The results indicate that Square Wave and CFO-based protocols in the Server setting are more robust against the attack than CFO-based protocols in the User setting. Our evaluation also reveals new relationships between LDP security and its inherent design choices. We find that the hash domain size in local-hashing-based LDP has a profound impact on protocol robustness beyond its well-known effect on utility. Further, we propose a zero-shot attack detection method that leverages the rich reconstructed distribution information. The experiments show that our detection significantly outperforms existing methods and effectively identifies data manipulation in challenging scenarios.
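
To make the detection idea concrete, the following is a minimal sketch of a Kolmogorov-Smirnov-style check on a reconstructed distribution. It is an illustration under stated assumptions, not the authors' detection procedure: the function names `ks_statistic` and `looks_poisoned`, the fixed `threshold`, and the availability of a `reference` histogram over the same bins are all hypothetical choices made for this example.

```python
from itertools import accumulate


def ks_statistic(hist_p, hist_q):
    """Largest absolute gap between the CDFs of two histograms on the same bins."""
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must share the same bins")
    total_p, total_q = sum(hist_p), sum(hist_q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Normalize the running sums so each sequence is a CDF ending at 1.
    cdf_p = [c / total_p for c in accumulate(hist_p)]
    cdf_q = [c / total_q for c in accumulate(hist_q)]
    return max(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def looks_poisoned(reconstructed, reference, threshold=0.1):
    """Flag a reconstructed histogram whose KS distance exceeds the threshold.

    Assumption: `threshold` is an illustrative constant, not a calibrated
    critical value from a hypothesis test.
    """
    return ks_statistic(reconstructed, reference) > threshold


# Hypothetical data: a uniform reference and a histogram shifted to the right.
reference = [0.25, 0.25, 0.25, 0.25]
shifted = [0.10, 0.20, 0.30, 0.40]
print(ks_statistic(reference, shifted))
print(looks_poisoned(shifted, reference))
```

In practice the decision threshold would come from the null distribution of the test statistic, and the LDP estimation noise would have to be accounted for before attributing a large gap to poisoning.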
