Delta Sum Learning: an approach for fast and global convergence in Gossip Learning

Tom Goethals, Merlijn Sebrechts, Stijn De Schrijver, Filip De Turck, Bruno Volckaert

TL;DR

This work tackles the slow global convergence of Gossip Learning by introducing Delta Sum Learning, a mathematically grounded update integration technique that aggregates base weights across nodes and their deltas with a dynamic gossip factor. Implemented within a decentralized edge framework based on Flocky and Open Application Model, it demonstrates stronger global convergence and a logarithmic rather than linear loss of accuracy as the topology scales, with a 58% lower drop in global accuracy when scaling from 10 to 50 nodes. The evaluation, conducted on MNIST with a CNN across 10–50 node topologies, shows Delta Sum Learning outperforming standard averaging and variance-corrected averaging in larger networks, at the cost of increased network traffic. The work also details architectural integration, experimental methodology, and avenues for future enhancements in data asymmetry handling, network efficiency, and security.

Abstract

Federated Learning is a popular approach for distributed learning due to its security and computational benefits. With the advent of powerful devices at the network edge, Gossip Learning further decentralizes Federated Learning by removing centralized integration and relying fully on peer-to-peer updates. However, the averaging methods generally used in both Federated and Gossip Learning are not ideal for model accuracy and global convergence. Additionally, there are few options to deploy learning workloads at the edge as part of a larger application using a declarative approach such as Kubernetes manifests. This paper proposes Delta Sum Learning as a method to improve the basic aggregation operation in Gossip Learning, and implements it in a decentralized orchestration framework based on Open Application Model, which allows for dynamic node discovery and intent-driven deployment of multi-workload applications. Evaluation results show that Delta Sum performance is on par with alternative integration methods for 10-node topologies, but results in a 58% lower global accuracy drop when scaling to 50 nodes. Overall, it shows strong global convergence and a logarithmic loss of accuracy with increasing topology size, compared to a linear loss for alternatives under limited connectivity.
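
For orientation, the averaging baseline that the abstract contrasts against can be sketched as follows. This is not Delta Sum Learning and not the paper's implementation: it is a minimal illustration of plain gossip averaging, assuming a small ring topology, pull-based peer selection, and one-element weight vectors. The names `average_weights`, `gossip_round`, `models`, and `neighbours` are hypothetical.

```python
import random


def average_weights(local, received):
    """Element-wise mean of two weight vectors (plain gossip averaging)."""
    return [(a + b) / 2.0 for a, b in zip(local, received)]


def gossip_round(models, neighbours, rng):
    """One round: every node pulls one random neighbour's model and averages.

    A snapshot is taken first so that all nodes merge models from the
    start of the round rather than partially updated ones.
    """
    snapshot = {node: list(weights) for node, weights in models.items()}
    for node in models:
        peer = rng.choice(neighbours[node])
        models[node] = average_weights(snapshot[node], snapshot[peer])
    return models


# Assumed toy setup: 4 nodes on a ring, each holding a single weight.
models = {i: [float(i)] for i in range(4)}
neighbours = {i: [(i - 1) % 4, (i + 1) % 4] for i in range(4)}
rng = random.Random(0)

for _ in range(50):
    gossip_round(models, neighbours, rng)

# Disagreement between nodes after gossiping (max minus min weight).
values = [weights[0] for weights in models.values()]
spread = max(values) - min(values)
print(f"spread after 50 rounds: {spread:.6f}")
```

Because each node only ever mixes with its direct neighbours, agreement has to propagate hop by hop through the topology, which is the limited-connectivity setting in which the abstract reports a linear loss of accuracy for averaging-based alternatives.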

Paper Structure

This paper contains 14 sections, 21 equations, 4 figures.

Figures (4)

  • Figure 1: Architecture overview of decentralized cluster discovery and Gossip Learning, showing (1) node discovery and metadata synchronization between nodes, (2) ML model synchronization between workloads and a dedicated service, and (3) model gossiping based on locally discovered nodes and hosted ML workloads.
  • Figure 2: Accuracy for Delta Sum Learning and alternative strategies for topology sizes ranging from 10 nodes (a) to 50 nodes (c). Shaded areas indicate the full range of accuracies across all nodes, with statistical outliers excluded for readability. Across topology sizes, Delta Sum Learning loses far less accuracy than the alternative strategies as the gossip topology grows.
  • Figure 3: Median accuracy of Delta Sum Learning compared to other strategies at round 235, for an increasing number of nodes in the topology. While accuracy loss for other strategies is almost linear with number of nodes, Delta Sum appears to follow a logarithmic pattern.
  • Figure 4: Network throughput in model updates per second, both as measured and as calculated for various theoretical scenarios such as constant connectivity, connectivity increasing with node density, and FedAvg under the same conditions as the evaluations.