Data Poisoning Attacks to Locally Differentially Private Range Query Protocols
Ting-Wei Liao, Chih-Hsun Lin, Yu-Lin Tsai, Takao Murakami, Chia-Mu Yu, Jun Sakuma, Chun-Ying Huang, Hiroaki Kikuchi
TL;DR
This work investigates data poisoning attacks on locally differentially private (LDP) range query protocols and shows that the common post-processing step Norm-Sub can massively amplify an attacker's influence. The authors develop two provably optimal attacks, AoT for the tree-based AHEAD protocol and AoG for the grid-based HDG protocol, which craft fake user data to maximize the response to a target range query while keeping that data consistent across tree levels or grids. They also study a potential countermeasure and propose an adaptive attack that evades it, and they support their claims with theoretical analysis and extensive experiments on synthetic and real-world datasets, in which attackers controlling a small fraction of users achieve 5–10x more influence on the estimate than a normal user. The findings expose a significant vulnerability in current LDP range query protocols and underscore the need for robust defenses and redesigns that balance privacy, utility, and security in decentralized data collection.
Abstract
Local Differential Privacy (LDP) has been widely adopted to protect user privacy in decentralized data collection. However, recent studies have revealed that LDP protocols are vulnerable to data poisoning attacks, in which malicious users manipulate their reported data to distort aggregated results. In this work, we present the first study of data poisoning attacks targeting LDP range query protocols, covering both tree-based and grid-based approaches. We identify three key challenges in executing such attacks: crafting consistent and effective fake data, maintaining data consistency across levels or grids, and preventing server detection. To address the first two challenges, we propose novel, provably optimal attack methods, one tree-based and one grid-based, designed to manipulate range query results with high effectiveness. Our key finding is that Norm-Sub, the common post-processing procedure in LDP range query protocols, can help the attacker massively amplify the effectiveness of the attack. To address the third challenge, we study a potential countermeasure and propose an adaptive attack capable of evading it. We evaluate our methods through theoretical analysis and extensive experiments on synthetic and real-world datasets. Our results show that, by manipulating a small fraction of users, the proposed attacks can significantly amplify the estimates for arbitrary range queries, giving the attacker 5-10x more influence on the estimate than a normal user.
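For readers unfamiliar with Norm-Sub, the following is a minimal Python sketch of the post-processing step as it is commonly described in the LDP literature: negative frequency estimates are clamped to zero and a shared offset is subtracted from the remaining entries so that the result sums to one. The function name `norm_sub` and the input values are illustrative assumptions, not taken from the paper, and the sketch shows only the post-processing step, not the AoT or AoG attacks.

```python
def norm_sub(estimates, total=1.0):
    """Sketch of Norm-Sub: clamp negatives to zero and subtract a shared
    offset from the remaining entries so that the output sums to `total`.
    (Hypothetical helper; assumes total > 0.)"""
    est = [float(x) for x in estimates]
    active = [True] * len(est)  # entries still allowed to stay positive

    while True:
        n_active = sum(active)
        if n_active == 0:
            return [0.0] * len(est)

        # Shared offset that makes the active entries sum to `total`.
        active_sum = sum(x for x, a in zip(est, active) if a)
        delta = (active_sum - total) / n_active

        # Entries that would go negative are fixed at zero, then we retry.
        changed = False
        for i, x in enumerate(est):
            if active[i] and x - delta < 0:
                active[i] = False
                changed = True

        if not changed:
            return [x - delta if a else 0.0 for x, a in zip(est, active)]


# Illustrative noisy estimates (made-up numbers): they sum to 0.8 and two are negative.
noisy = [0.6, 0.5, -0.1, -0.2]
cleaned = norm_sub(noisy)
print(cleaned)
```

Because the offset is shared, every output value depends on all inputs: changing some estimates shifts the offset applied to the others. This coupling is the kind of interaction the abstract refers to when it says post-processing can amplify an attack.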
