FedPoisonTTP: A Threat Model and Poisoning Attack for Federated Test-Time Personalization
Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Amin Ahsan Ali, AKM Mahbubur Rahman, Sajib Mistry, Aneesh Krishna
TL;DR
FedPoisonTTP exposes a realistic security gap in federated test-time personalization by modeling a grey-box adversary with limited visibility and partial participation. The authors introduce a practical attack framework that combines a history-based surrogate aggregator, a feature-regularized in-distribution poisoning scheme, and objectives aware of test-time adaptation (TTA) to degrade post-aggregation performance across honest clients. In experiments on CIFAR-10-C and CIFAR-100-C with multiple federated test-time adaptation (FTTA) methods and aggregation rules, they show significant degradation, especially on CIFAR-100-C, and demonstrate that grey-box attacks transfer. The work highlights the need for robust defenses in FTTA and provides a concrete foundation for future research on secure and resilient federated personalization.
Abstract
Test-time personalization in federated learning enables client models to adapt online to local domain shifts, improving robustness in deployment. Yet existing federated learning work largely overlooks the security risks that arise when local adaptation occurs at test time. Heterogeneous domain arrivals, diverse adaptation algorithms, and limited cross-client visibility create vulnerabilities: compromised participants can craft poisoned inputs and submit adversarial updates that undermine both global and per-client performance. To study this threat, we introduce FedPoisonTTP, a realistic grey-box attack framework that explores test-time data poisoning in the federated adaptation setting. FedPoisonTTP distills a surrogate model from adversarial queries, synthesizes in-distribution poisons using feature consistency, and optimizes attack objectives to generate high-entropy or class-confident poisons that evade common adaptation filters. These poisons are injected during local adaptation and spread through collaborative updates, leading to broad degradation. Extensive experiments on corrupted vision benchmarks show that compromised participants can substantially diminish overall test-time performance.
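
As a rough illustration of what "high-entropy" and "class-confident" mean for a single prediction, the sketch below computes the softmax entropy and the top-class confidence from a vector of logits, together with a simple entropy-threshold filter of the kind some test-time adaptation methods use to skip unreliable samples. It is not the authors' implementation: the function names, the threshold fraction of 0.4, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)


def top_class_confidence(logits):
    """Probability assigned to the most likely class."""
    return max(softmax(logits))


def passes_entropy_filter(logits, fraction=0.4):
    """Hypothetical filter: keep a sample only if its entropy is below
    `fraction` times the maximum possible entropy ln(num_classes).
    The value 0.4 is an assumed setting, not taken from the paper."""
    threshold = fraction * math.log(len(logits))
    return prediction_entropy(logits) < threshold


# Assumed example logits for a 10-class problem.
uniform_logits = [0.0] * 10             # maximally uncertain prediction
confident_logits = [10.0] + [0.0] * 9   # one class dominates

for name, logits in [("uniform", uniform_logits),
                     ("confident", confident_logits)]:
    print(name,
          round(prediction_entropy(logits), 4),
          round(top_class_confidence(logits), 4),
          passes_entropy_filter(logits))
```

In this toy setting the uniform prediction reaches the maximum entropy ln(10) and is rejected by the filter, while the confident prediction has near-zero entropy and is kept. An attack objective that targets either quantity would be optimized over the poisoned input itself, which this sketch does not attempt.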
