SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening

Murtaza Rangwala; Farag Azzedin; Richard O. Sinnott; Rajkumar Buyya

TL;DR

SketchGuard addresses the scalability bottleneck of Byzantine-robust decentralized federated learning by screening neighbors in the compressed sketch domain. Each client uses Count Sketch to generate $k$-dimensional summaries, screens neighbors using distances that are preserved up to a $(1+\epsilon)$-type factor, and retrieves full models only from accepted neighbors. This reduces per-round communication from $O(d|N_i|)$ to $O(k|N_i| + d|S_i|)$ while maintaining theoretical convergence guarantees in both strongly convex and non-convex settings, with the effective threshold parameter controlled as $\gamma_{\text{eff}} = \gamma\sqrt{(1+\epsilon)/(1-\epsilon)}$. The authors prove that the compression preserves Byzantine resilience with degradation bounded by a factor of $1+O(\epsilon)$ and provide convergence rates matching state-of-the-art methods. Experiments on FEMNIST and CelebA across varied network topologies and attack models show that SketchGuard matches the robustness of BALANCE and UBAR while reducing computation by up to $82\%$ and communication by $50$-$70\%$, with benefits that scale multiplicatively with model size and network connectivity.
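To make the communication bound concrete, the following minimal sketch counts transmitted values per round for one client, assuming all constants hidden by the $O(\cdot)$ notation are 1. The function names and the example values of $d$, $k$, $|N_i|$, $|S_i|$, $\gamma$, and $\epsilon$ are illustrative assumptions, not figures from the paper.

```python
import math

def full_exchange_cost(d, num_neighbors):
    """Values received when every neighbor sends its full d-dimensional model."""
    return d * num_neighbors

def sketch_screening_cost(d, k, num_neighbors, num_accepted):
    """Values received when every neighbor sends a k-dimensional sketch
    and only accepted neighbors send their full model afterwards."""
    return k * num_neighbors + d * num_accepted

def gamma_eff(gamma, eps):
    """Effective threshold gamma * sqrt((1 + eps) / (1 - eps)), as in the TL;DR."""
    return gamma * math.sqrt((1 + eps) / (1 - eps))

# Hypothetical setting: 1M-parameter model, 10k-dimensional sketch,
# 10 neighbors of which 5 pass screening.
d, k = 1_000_000, 10_000
n_neighbors, n_accepted = 10, 5

baseline = full_exchange_cost(d, n_neighbors)
sketched = sketch_screening_cost(d, k, n_neighbors, n_accepted)
saving = 1 - sketched / baseline

print(f"baseline={baseline}, sketched={sketched}, saving={saving:.0%}")
print(f"gamma_eff for gamma=1.0, eps=0.1: {gamma_eff(1.0, 0.1):.4f}")
```

Under this accounting the saving comes entirely from rejected neighbors: if every neighbor is accepted, the sketch exchange adds $k|N_i|$ values on top of the full exchange, which is consistent with the reported savings depending on filtering effectiveness.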

Abstract

Decentralized Federated Learning (DFL) enables privacy-preserving collaborative training without centralized servers, but remains vulnerable to Byzantine attacks where malicious clients submit corrupted model updates. Existing Byzantine-robust DFL defenses rely on similarity-based neighbor screening that requires every client to exchange and compare complete high-dimensional model vectors with all neighbors in each training round, creating prohibitive communication and computational costs that prevent deployment at web scale. We propose SketchGuard, a general framework that decouples Byzantine filtering from model aggregation through sketch-based neighbor screening. SketchGuard compresses $d$-dimensional models to $k$-dimensional sketches ($k \ll d$) using Count Sketch for similarity comparisons, then selectively fetches full models only from accepted neighbors, reducing per-round communication complexity from $O(d|N_i|)$ to $O(k|N_i| + d|S_i|)$, where $|N_i|$ is the neighbor count and $|S_i| \le |N_i|$ is the accepted neighbor count. We establish rigorous convergence guarantees in both strongly convex and non-convex settings, proving that Count Sketch compression preserves Byzantine resilience with controlled degradation bounds where approximation errors introduce only a $(1+O(\epsilon))$ factor in the effective threshold parameter. Comprehensive experiments across multiple datasets, network topologies, and attack scenarios demonstrate that SketchGuard maintains identical robustness to state-of-the-art methods while reducing computation time by up to $82\%$ and communication overhead by $50$-$70\%$ depending on filtering effectiveness, with benefits scaling multiplicatively with model dimensionality and network connectivity. These results establish the viability of sketch-based compression as a fundamental enabler of robust DFL at web scale.
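The screen-then-fetch pipeline described above can be illustrated with a small NumPy sketch. It uses a single-row Count Sketch with shared hash and sign functions. The acceptance rule (sketch distance at most $\gamma$ times the norm of the client's own sketch) and the plain averaging step are simplifying assumptions for illustration, since the excerpt does not spell out the paper's exact screening rule or aggregation. All function names and toy parameters are hypothetical.

```python
import numpy as np

def make_hashes(d, k, seed=0):
    """Shared randomness: one bucket index and one random sign per coordinate."""
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, k, size=d)      # h: coordinate -> bucket in [0, k)
    signs = rng.choice([-1.0, 1.0], size=d)   # s: coordinate -> +1 or -1
    return buckets, signs

def count_sketch(x, buckets, signs, k):
    """Single-row Count Sketch: add each signed coordinate into its bucket."""
    return np.bincount(buckets, weights=signs * x, minlength=k)

def screen_neighbors(own_sketch, neighbor_sketches, gamma):
    """Assumed rule: accept neighbor j if its sketch lies within
    gamma * ||own_sketch|| of the client's own sketch."""
    limit = gamma * np.linalg.norm(own_sketch)
    return [j for j, s in enumerate(neighbor_sketches)
            if np.linalg.norm(own_sketch - s) <= limit]

# Toy round: three near-identical honest neighbors, two heavily perturbed ones.
d, k, gamma = 20_000, 1_000, 0.5
rng = np.random.default_rng(1)
w_own = rng.normal(size=d)
honest = [w_own + 0.01 * rng.normal(size=d) for _ in range(3)]
byzantine = [w_own + 10.0 * rng.normal(size=d) for _ in range(2)]
neighbors = honest + byzantine

# Every client must use the same hashes so that sketches are comparable.
buckets, signs = make_hashes(d, k)
own_sketch = count_sketch(w_own, buckets, signs, k)
neighbor_sketches = [count_sketch(w, buckets, signs, k) for w in neighbors]

accepted = screen_neighbors(own_sketch, neighbor_sketches, gamma)

# Only accepted neighbors' full models are fetched; here they are simply
# averaged with the local model (an illustrative aggregation choice).
aggregate = np.mean([w_own] + [neighbors[j] for j in accepted], axis=0)
print("accepted neighbor indices:", accepted)
```

Because Count Sketch is linear, the sketch of a difference equals the difference of the sketches. This is why distances can be compared in the $k$-dimensional domain without ever exchanging the full $d$-dimensional vectors of rejected neighbors.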
