Kick Bad Guys Out! Conditionally Activated Anomaly Detection in Federated Learning with Zero-Knowledge Proof Verification

Shanshan Han; Wenxuan Wu; Baturalp Buyukates; Weizhao Jin; Qifan Zhang; Yuhang Yao; Salman Avestimehr; Chaoyang He

TL;DR

This work tackles the vulnerability of Federated Learning (FL) to adversarial clients by introducing RedJasper, a two-stage anomaly detector that activates only when attacks are suspected, paired with zero-knowledge proofs (ZKPs) that let clients verify the server's detection process. The first stage uses cross-round similarity checks to decide whether to trigger the second stage, which scores the evilness of each local model by its L2 distance from the previous round's global model and applies a $3\sigma$-based cutoff to remove malicious updates. Key contributions include an on-demand defense that needs no unrealistic prior knowledge, conditional activation that preserves accuracy in benign settings, a statistically principled removal mechanism, and cryptographic verifiability via ZKPs. Experiments on CV and NLP tasks show robust performance under diverse attack scenarios, and ZKP verification adds feasible overhead. Overall, RedJasper narrows the gap between theoretical FL security and real-world deployment by delivering a trustworthy, transparent defense in heterogeneous environments.

Abstract

Federated Learning (FL) systems are susceptible to adversarial attacks, such as model poisoning attacks and backdoor attacks. Existing defense mechanisms face critical limitations in real-world deployments, such as relying on impractical assumptions (e.g., adversaries acknowledging the presence of attacks before attacking) or undermining accuracy in model training, even in benign scenarios. To address these challenges, we propose RedJasper, a two-stage anomaly detection method specifically designed for real-world FL deployments. It identifies suspicious activities in the first stage, then conditionally activates the second stage to further scrutinize the suspicious local models, employing the 3σ rule to identify the truly malicious local models and filter them out of FL training. To ensure integrity and transparency within the FL system, RedJasper integrates zero-knowledge proofs, enabling clients to cryptographically verify the server's detection process without relying on the server's goodwill. RedJasper operates without unrealistic assumptions and avoids interfering with FL training in attack-free scenarios. It bridges the gap between theoretical advances in FL security and the practical demands of real-world deployment. Experimental results demonstrate that RedJasper consistently delivers performance comparable to benign cases, highlighting its effectiveness in identifying potential attacks and eliminating malicious models with high accuracy.
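The conditional activation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes the first stage compares each local model to a reference model from the previous round using cosine similarity (the measure shown in Figure 2), and the names `cross_round_suspects`, `run_round`, and the `threshold` value of 0.5 are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two flat parameter vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as maximally dissimilar
    return dot / (norm_u * norm_v)


def cross_round_suspects(local_models, reference, threshold=0.5):
    """Stage 1 (assumed form): flag clients whose model has low
    cosine similarity to the previous-round reference model."""
    return [
        i for i, w in enumerate(local_models)
        if cosine_similarity(w, reference) < threshold
    ]


def run_round(local_models, reference, second_stage, threshold=0.5):
    """Run stage 2 only when stage 1 raises suspicion.

    `second_stage` is any callable that takes (local_models, reference)
    and returns the indices of the models to keep.
    """
    suspects = cross_round_suspects(local_models, reference, threshold)
    if not suspects:
        # No attack suspected: keep every model, training is untouched.
        return list(range(len(local_models)))
    return second_stage(local_models, reference)
```

The point of the gate is that an attack-free round never reaches the filtering step, so benign updates are aggregated exactly as they would be without any defense.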

Paper Structure (21 sections, 1 theorem, 1 equation, 13 figures, 2 tables, 5 algorithms)


Introduction
Problem Setting
Adversary Model
Preliminaries
RedJasper: a Two-Stage Anomaly Detection Mechanism
Cross-Round Detection
Cross-Client Detection
Verifiable Anomaly Detection
Evaluations
Related Works
Conclusion
Details of Krum and m-Krum
Proof of the range of PPV
Proof of Theorem 3.2
Effectiveness of the 3$\sigma$ Rule
...and 6 more sections

Key Result

Theorem 3.2

Let $\mathcal{L}$ be the evilness level scores for client models in the current FL round, where $\mathcal{L}$ follows the normal distribution $\mathcal{N}(\mu, \sigma)$. The evilness level for each client $i$ is computed as $\mathcal{L}[i]=\|\mathbf{w}^\tau_i-\mathbf{w}^{\tau-1}_{\text{g}}\|_2$. Und
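The score in Theorem 3.2 and a 3σ cutoff can be sketched as below. The score follows the formula above; the cutoff itself is an assumption, since the theorem statement is truncated here: this sketch removes any model whose score exceeds $\mu + 3\sigma$, with $\mu$ and $\sigma$ estimated from the current round's scores. The function names and the parameter `k` are hypothetical.

```python
import math
import statistics


def evilness_scores(local_models, prev_global):
    """L[i] = ||w_i^tau - w_g^(tau-1)||_2 for each client model.

    Models are flat lists of floats with the same length as prev_global.
    """
    return [math.dist(w, prev_global) for w in local_models]


def three_sigma_filter(scores, k=3.0):
    """Split client indices into kept and removed using a k-sigma cutoff.

    Assumption: a one-sided upper bound mu + k * sigma, where mu and
    sigma are the mean and population standard deviation of the scores.
    """
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    cutoff = mu + k * sigma
    kept = [i for i, s in enumerate(scores) if s <= cutoff]
    removed = [i for i, s in enumerate(scores) if s > cutoff]
    return kept, removed, cutoff
```

Because the outliers themselves inflate the estimated $\sigma$, a single malicious model can only exceed $\mu + 3\sigma$ when the round has more than ten clients, so a cutoff of this form only makes sense with a reasonably large cohort.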

Figures (13)

Figure 1: Overview of RedJasper
Figure 2: Cosine similarities. Label 1 indicates likely benign models with high cosine similarity, and label 2 indicates likely malicious models with low cosine similarity.
Figure 3: ZKP circuits designed for RedJasper.
Figure 4: Impacts of different parameters.
Figure 5: Byzantine attacks.
...and 8 more figures

Theorems & Definitions (4)

Definition 3.1
Theorem 3.2
proof
proof
