Impact of Network Topology on Byzantine Resilience in Decentralized Federated Learning
Siddhartha Bhattacharya, Daniel Helo, Joshua Siegel
TL;DR
The paper investigates how network topology impacts Byzantine resilience in decentralized Federated Learning (DFL). It empirically evaluates two Byzantine-robust aggregations (Krum and GeoMed) on small-world and scale-free networks with 128 nodes training a CNN on MNIST, under Gaussian Byzantine attacks with random and strategic placements. Findings show that state-of-the-art robust aggregations fail to prevent degradation in large, non-fully connected networks when Byzantine nodes are strategically placed, with hubs in scale-free networks being particularly vulnerable; small-world topologies exhibit comparatively better robustness but are not immune. The work argues for topology-aware aggregation and motivates integrating network topology theory into the design of robust DFL systems for real-world deployments.
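To make the ingredients of this setup concrete, the following is a minimal sketch, not the authors' implementation. It shows a NumPy version of the Krum selection rule applied to the updates one node receives, and a helper that picks the highest-degree nodes as a stand-in for "strategic" placement. The function names `krum` and `highest_degree_nodes`, the adjacency-dict graph format, and the modeling of the Gaussian attack as a pure-noise update are all assumptions made for illustration.

```python
import numpy as np


def krum(updates, num_byzantine):
    """Return the single update with the lowest Krum score.

    updates: array-like of shape (n, d), one flattened model update per row.
    num_byzantine: assumed upper bound f on Byzantine rows. The original
    Krum analysis assumes n > 2f + 2; this sketch only checks that the
    score is computable (n >= f + 3).
    """
    updates = np.asarray(updates, dtype=float)
    n = updates.shape[0]
    k = n - num_byzantine - 2  # number of nearest neighbours scored
    if k < 1:
        raise ValueError("need at least num_byzantine + 3 updates")
    # Pairwise squared Euclidean distances between all updates.
    diffs = updates[:, None, :] - updates[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Exclude each update's distance to itself from its own score.
    np.fill_diagonal(sq_dists, np.inf)
    # Score = sum of squared distances to the k closest other updates.
    scores = np.sort(sq_dists, axis=1)[:, :k].sum(axis=1)
    return updates[np.argmin(scores)]


def highest_degree_nodes(adjacency, count):
    """Pick `count` nodes with the largest degree (ties broken by node id).

    adjacency: dict mapping node id -> iterable of neighbour ids
    (a hypothetical graph format chosen for this sketch).
    """
    degrees = {node: len(list(neigh)) for node, neigh in adjacency.items()}
    return sorted(degrees, key=lambda node: (-degrees[node], node))[:count]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy hub-and-spoke graph: node 0 is the hub.
    graph = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
    print("strategic placement candidates:", highest_degree_nodes(graph, 1))

    # Six honest updates near 1.0 plus one assumed Gaussian-noise update.
    honest = 1.0 + 0.01 * rng.standard_normal((6, 4))
    noisy = rng.normal(loc=0.0, scale=10.0, size=(1, 4))
    received = np.vstack([honest, noisy])
    print("Krum selection:", krum(received, num_byzantine=1))
```

In a decentralized setting each node would call `krum` only on the updates from its own neighbours, so the number of rows `n` depends on node degree. A low-degree node may receive too few updates for the rule to be applicable at all, which is one way topology enters the picture.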
Abstract
Federated learning (FL) enables collaborative training of machine learning models without sharing training data between users, typically by aggregating model gradients on a central server. Decentralized federated learning is an emerging paradigm in which users train models collaboratively in a peer-to-peer manner, without a central aggregation server. However, before decentralized FL can be applied in real-world training environments, nodes that deviate from the FL process (Byzantine nodes) must be considered when selecting an aggregation function. Recent research has focused on Byzantine-robust aggregation for client-server or fully connected networks, but has not yet evaluated such aggregation schemes on the complex topologies that decentralized FL makes possible. Empirical evidence of Byzantine robustness across differing network topologies is therefore needed. This work investigates the behavior of state-of-the-art Byzantine-robust aggregation methods in complex, large-scale network structures. We find that state-of-the-art Byzantine-robust aggregation strategies are not resilient within large, non-fully connected networks. Our findings therefore point the field towards the development of topology-aware aggregation schemes, which are especially necessary for large-scale real-world deployment.
