SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening
Murtaza Rangwala, Farag Azzedin, Richard O. Sinnott, Rajkumar Buyya
TL;DR
SketchGuard addresses the scalability bottleneck of Byzantine-robust decentralized federated learning by screening neighbors in the compressed sketch domain. Each client uses Count Sketch to generate a $k$-dimensional summary of its model, screens neighbors on sketch distances that are preserved up to a $(1+\epsilon)$-type factor, and retrieves full models only from accepted neighbors. This reduces per-round communication from $O(d|N_i|)$ to $O(k|N_i| + d|S_i|)$ while maintaining theoretical convergence guarantees in both strongly convex and non-convex settings, with an effective threshold parameter $\gamma_{\text{eff}} = \gamma\sqrt{(1+\epsilon)/(1-\epsilon)}$. The authors prove that the compression preserves Byzantine resilience with degradation bounded by a factor $1+O(\epsilon)$ and provide convergence rates matching state-of-the-art methods. Experiments on FEMNIST and CelebA across varied network topologies and attack models show that SketchGuard achieves identical robustness to BALANCE and UBAR while reducing computation by up to $82\%$ and communication by $50$-$70\%$, with benefits that scale multiplicatively with model size and network connectivity.
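As a rough illustration of the two formulas above, the following sketch counts how many scalar values a client receives per round with and without sketch-based screening, and evaluates $\gamma_{\text{eff}}$. The function names and every numeric value in the usage lines are assumptions chosen for illustration, not figures from the paper, and constant factors hidden by the $O(\cdot)$ notation are ignored.

```python
import math

def full_exchange_cost(d, num_neighbors):
    """Scalars received when every neighbor sends its full d-dimensional model."""
    return d * num_neighbors

def sketch_screening_cost(d, k, num_neighbors, num_accepted):
    """Scalars received when every neighbor sends a k-dimensional sketch
    and only the accepted neighbors send their full model afterwards."""
    return k * num_neighbors + d * num_accepted

def effective_gamma(gamma, eps):
    """Effective threshold parameter gamma * sqrt((1 + eps) / (1 - eps))."""
    if not 0 <= eps < 1:
        raise ValueError("eps must lie in [0, 1)")
    return gamma * math.sqrt((1 + eps) / (1 - eps))

# Hypothetical setting: 1M-parameter model, 10k-dimensional sketches,
# 20 neighbors of which 8 pass screening.
baseline = full_exchange_cost(1_000_000, 20)
screened = sketch_screening_cost(1_000_000, 10_000, 20, 8)
print(f"communication saved: {1 - screened / baseline:.0%}")
print(f"gamma_eff for gamma=1.0, eps=0.1: {effective_gamma(1.0, 0.1):.3f}")
```

The saving depends on how many neighbors are rejected: if every neighbor is accepted ($|S_i| = |N_i|$), the sketch exchange is pure overhead of $k|N_i|$ values on top of the full exchange.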
Abstract
Decentralized Federated Learning (DFL) enables privacy-preserving collaborative training without centralized servers, but remains vulnerable to Byzantine attacks in which malicious clients submit corrupted model updates. Existing Byzantine-robust DFL defenses rely on similarity-based neighbor screening that requires every client to exchange and compare complete high-dimensional model vectors with all neighbors in each training round, creating prohibitive communication and computational costs that prevent deployment at web scale. We propose SketchGuard, a general framework that decouples Byzantine filtering from model aggregation through sketch-based neighbor screening. SketchGuard compresses $d$-dimensional models to $k$-dimensional sketches ($k \ll d$) using Count Sketch for similarity comparisons, then selectively fetches full models only from accepted neighbors. This reduces per-round communication complexity from $O(d|N_i|)$ to $O(k|N_i| + d|S_i|)$, where $|N_i|$ is the neighbor count and $|S_i| \le |N_i|$ is the accepted neighbor count. We establish rigorous convergence guarantees in both strongly convex and non-convex settings, proving that Count Sketch compression preserves Byzantine resilience with controlled degradation: approximation errors introduce only a $(1+O(\epsilon))$ factor in the effective threshold parameter. Comprehensive experiments across multiple datasets, network topologies, and attack scenarios demonstrate that SketchGuard maintains identical robustness to state-of-the-art methods while reducing computation time by up to 82% and communication overhead by 50-70% depending on filtering effectiveness, with benefits scaling multiplicatively with model dimensionality and network connectivity. These results establish the viability of sketch-based compression as a fundamental enabler of robust DFL at web scale.
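The screen-then-fetch idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `make_count_sketch` and `screen_neighbors` are hypothetical, the acceptance rule is a plain fixed distance threshold rather than the paper's actual threshold schedule, and all clients are assumed to share the same hash seed so that their sketches are comparable.

```python
import numpy as np

def make_count_sketch(d, k, seed=0):
    """Return a function mapping a d-dimensional vector to a k-dimensional
    Count Sketch. Each coordinate is hashed to one bucket with a random sign.
    Assumption: every client builds the sketch with the same seed."""
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, k, size=d)               # bucket index per coordinate
    signs = rng.choice(np.array([-1.0, 1.0]), size=d)  # random sign per coordinate

    def sketch(x):
        out = np.zeros(k)
        np.add.at(out, buckets, signs * x)  # unbuffered add handles repeated buckets
        return out

    return sketch

def screen_neighbors(own_sketch, neighbor_sketches, threshold):
    """Return ids of neighbors whose sketch lies within `threshold` of ours.
    The fixed threshold is an illustrative stand-in for the real rule."""
    return [
        nid
        for nid, s in neighbor_sketches.items()
        if np.linalg.norm(own_sketch - s) <= threshold
    ]

# Hypothetical round: one honest neighbor near our model, one far-off model.
d, k = 10_000, 1_000
rng = np.random.default_rng(1)
own = rng.normal(size=d)
models = {
    "honest": own + 0.01 * rng.normal(size=d),
    "byzantine": own + 10.0 * rng.normal(size=d),
}
sketch = make_count_sketch(d, k, seed=42)
received = {nid: sketch(m) for nid, m in models.items()}  # only k floats each
accepted = screen_neighbors(sketch(own), received, threshold=5.0)
full_models = [models[nid] for nid in accepted]  # fetch d floats only from accepted
print(accepted)
```

Because Count Sketch is a linear map, the difference of two sketches equals the sketch of the difference of the models, which is why distances measured between sketches can stand in for distances between full models during screening.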
