Secure Decentralized Learning with Blockchain
Xiaoxue Zhang, Yifan Hua, Chen Qian
TL;DR
This paper tackles poisoning and incentive challenges in decentralized federated learning by introducing Blockchain-based Decentralized Federated Learning (BDFL). It combines a P2P DFL overlay with a blockchain-backed auditor committee, reputation model, and incentive mechanisms to verify updates before aggregation, thereby mitigating malicious updates while preserving privacy via differential privacy. The protocol suite covers topology maintenance, model exchange, verification, reputation management, and incentives, and leverages a near-random regular graph overlay for scalability. Experiments on MNIST and CIFAR-10 show that BDFL attains fast convergence and high accuracy even with up to $30\%$ malicious clients, thanks to reputation-driven filtering and audit-based verification.
Abstract
Federated Learning (FL) is a well-known paradigm of distributed machine learning on mobile and IoT devices, which preserves data privacy and optimizes communication efficiency. To avoid the single point of failure in FL, decentralized federated learning (DFL) has been proposed, which uses peer-to-peer communication for model aggregation and is considered an attractive solution for machine learning tasks on distributed personal devices. However, this process is vulnerable to attackers who share false models and data. A group of malicious clients can degrade the performance of the model by carrying out a poisoning attack. In addition, clients in DFL often lack incentives to contribute their computing power to model training. In this paper, we propose Blockchain-based Decentralized Federated Learning (BDFL), which leverages a blockchain for decentralized model verification and auditing. BDFL includes an auditor committee for model verification, an incentive mechanism to encourage client participation, a reputation model to evaluate the trustworthiness of clients, and a protocol suite for dynamic network updates. Evaluation results show that, with the reputation mechanism, BDFL achieves fast model convergence and high accuracy on real datasets even when 30\% of the clients in the system are malicious.
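To make the idea of reputation-driven filtering before aggregation concrete, the following is a minimal sketch and not the BDFL protocol itself. The function name `aggregate_trusted`, the reputation threshold of 0.5, and the reputation-weighted averaging rule are all assumptions introduced for illustration; the abstract does not specify how reputation scores are computed or how they enter aggregation.

```python
from typing import Dict, List


def aggregate_trusted(
    updates: Dict[str, List[float]],
    reputation: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[float]:
    """Average peer updates, keeping only peers at or above the threshold.

    Kept updates are weighted by the sender's reputation score.
    Peers with no recorded reputation are treated as untrusted (score 0.0).
    """
    trusted = {
        cid: vec
        for cid, vec in updates.items()
        if reputation.get(cid, 0.0) >= threshold
    }
    if not trusted:
        raise ValueError("no trusted updates to aggregate")

    total = sum(reputation[cid] for cid in trusted)
    if total <= 0:
        raise ValueError("trusted reputations must sum to a positive value")

    dim = len(next(iter(trusted.values())))
    return [
        sum(reputation[cid] * vec[i] for cid, vec in trusted.items()) / total
        for i in range(dim)
    ]


# Two honest peers and one low-reputation peer sending an outlier update.
updates = {"a": [1.0, 1.0], "b": [3.0, 3.0], "m": [100.0, -100.0]}
reputation = {"a": 1.0, "b": 1.0, "m": 0.1}
print(aggregate_trusted(updates, reputation))  # [2.0, 2.0]
```

In this toy setting the outlier from peer `m` is dropped before averaging, so it has no influence on the aggregate. In BDFL the trust decision is instead backed by the auditor committee and the blockchain-recorded reputation model.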
