Fantastyc: Blockchain-based Federated Learning Made Secure and Practical

William Boitier, Antonella Del Pozzo, Álvaro García-Pérez, Stephane Gazut, Pierre Jobic, Alexis Lemaire, Erwan Mahe, Aurelien Mayoue, Maxence Perion, Tuanir Franca Rezende, Deepika Singh, Sara Tucci-Piergiovanni

TL;DR

This work tackles secure, scalable decentralized Federated Learning by replacing the central orchestrator with a blockchain-based framework. It introduces Fantastyc, which offloads validation to a set of servers and anchors off-chain computations with a Proof of Availability & Integrity (PoA&I): only cryptographic fingerprints are stored on-chain, while the data itself is kept in fault-tolerant distributed storage. The approach improves Byzantine tolerance, requiring only an honest majority among $2f_s+1$ servers, and stays practical by decoupling ordering from data integrity and availability and by using a lightweight confidentiality scheme (InstaHide). Experiments in geo-distributed deployments report feasible round latency, a robust privacy-utility trade-off, and scalability to large participant pools and sizable models, indicating Fantastyc's potential for real-world blockchain-based FL deployment.
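
The idea of keeping only a fingerprint on-chain can be illustrated with a minimal sketch. It is not the paper's implementation: the SHA-256 hash, the float64 serialization, and the names `fingerprint`, `publish`, `fetch_and_verify`, `off_chain_store` and `on_chain_log` are all assumptions made for illustration, with plain Python containers standing in for the distributed storage and the ledger.

```python
import hashlib
import struct

# Hypothetical in-memory stand-ins (assumptions, not the paper's components).
off_chain_store: dict[str, list[float]] = {}  # plays the role of distributed storage
on_chain_log: list[str] = []                  # plays the role of the ledger


def fingerprint(weights: list[float]) -> str:
    """Return the SHA-256 hex digest of a flat list of model weights."""
    # Pack as little-endian float64 so equal weights always hash identically.
    payload = struct.pack(f"<{len(weights)}d", *weights)
    return hashlib.sha256(payload).hexdigest()


def publish(weights: list[float]) -> str:
    """Store the weights off-chain and record only their digest on-chain."""
    digest = fingerprint(weights)
    off_chain_store[digest] = list(weights)
    on_chain_log.append(digest)
    return digest


def fetch_and_verify(digest: str) -> list[float]:
    """Fetch weights from storage and check them against the on-chain digest."""
    weights = off_chain_store[digest]
    if fingerprint(weights) != digest:
        raise ValueError("stored data does not match the on-chain fingerprint")
    return weights
```

The ledger entry has a fixed size (64 hex characters here) regardless of model size, and any modification of the stored weights is detected when the digest is recomputed.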

Abstract

Federated Learning is a decentralized framework that enables multiple clients to collaboratively train a machine learning model under the orchestration of a central server without sharing their local data. The central server is a single point of failure, which the literature addresses with blockchain-based federated learning approaches. While these approaches provide a fully decentralized solution with traceability, they still face several challenges regarding integrity, confidentiality and scalability before they can be deployed in practice. In this paper, we propose Fantastyc, a solution designed to address these challenges, which have never been met together in the state of the art.

Paper Structure (29 sections, 4 theorems, 3 equations, 5 figures, 2 tables, 2 algorithms)

Key Result

Lemma 1

(availability and integrity) For any $({\tt TAG}, v)$, if there exists a valid $P({\tt TAG}, v)$, then ${\sf locally\_valid}({\tt TAG}, v)$ holds for at least one correct server, and eventually $v$ is available from the distributed storage.
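
One common way to obtain the "at least one correct server" guarantee is a quorum argument, sketched below. The threshold is an assumption, not something stated on this page: the sketch supposes a proof counts as valid once it carries attestations from $f_s+1$ distinct servers out of $2f_s+1$, so that even if $f_s$ signers are Byzantine, one correct signer remains. The names `Proof`, `is_valid_proof` and `min_correct_signers` are hypothetical, and signature verification is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proof:
    """Hypothetical proof object: a tag, a data fingerprint, and signer ids."""
    tag: str
    fingerprint: str
    signers: frozenset  # ids of servers that attested locally_valid(TAG, v)


def is_valid_proof(proof: Proof, f_s: int) -> bool:
    """Assumed rule: valid if at least f_s + 1 distinct known servers signed."""
    n = 2 * f_s + 1  # total number of servers
    known = {s for s in proof.signers if 0 <= s < n}  # ignore unknown ids
    return len(known) >= f_s + 1


def min_correct_signers(proof: Proof, f_s: int) -> int:
    """Worst case: every Byzantine server (at most f_s) is among the signers."""
    n = 2 * f_s + 1
    known = {s for s in proof.signers if 0 <= s < n}
    return max(0, len(known) - f_s)
```

Under this assumed threshold, `is_valid_proof` returning `True` implies `min_correct_signers` is at least 1, which is the pigeonhole step behind the first half of the lemma.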

Figures (5)

  • Figure 1: Difference between Federated Learning (a) and our solution (b).
  • Figure 2: Sequence diagram of the Fantastyc workflow.
  • Figure 3: Impact of multi-client parallelism when varying $K$ (with $B=5$ and $E=10$). (a) Test set accuracy vs. communication rounds for different numbers of participants. (b) Number of communication rounds needed to reach a target accuracy of $95\%$, depending on the number of participants per round.
  • Figure 4: Evaluation of the impact of the number of nodes with a constant number of 50 clients.
  • Figure 5: Breakdown of latency within a round with 50 clients.

Theorems & Definitions (8)

  • Lemma 1
  • Proof 1: Sketch
  • Theorem 1
  • Proof 2: Sketch
  • Lemma 2
  • Proof 3: Sketch
  • Theorem 2
  • Proof 4: Sketch