FedOBD: Opportunistic Block Dropout for Efficiently Training Large-scale Neural Networks through Federated Learning

Yuanyuan Chen, Zichen Chen, Pengcheng Wu, Han Yu

TL;DR

FedOBD targets the high communication cost of training large-scale neural networks in federated settings by introducing opportunistic dropout at the block level. It decomposes a network into semantic blocks, ranks their importance using the Mean Block Difference, and uploads only the most significant blocks, compressed with an adaptive deterministic quantization scheme (NNADQ), in a two-stage training process. Experiments on CIFAR-10, CIFAR-100, and IMDB show that it reduces overall communication overhead by more than 88% relative to the best-performing baseline while achieving the highest test accuracy. The approach enables scalable, communication-efficient FL for large models, and open-source implementations are available for broader adoption.

Abstract

Large-scale neural networks possess considerable expressive power. They are well-suited for complex learning tasks in industrial applications. However, large-scale models pose significant challenges for training under the current Federated Learning (FL) paradigm. Existing approaches for efficient FL training often leverage model parameter dropout. However, manipulating individual model parameters is not only inefficient in meaningfully reducing the communication overhead when training large-scale FL models, but may also be detrimental to the scaling efforts and model performance, as shown by recent research. To address these issues, we propose the Federated Opportunistic Block Dropout (FedOBD) approach. The key novelty is that it decomposes large-scale models into semantic blocks so that FL participants can opportunistically upload quantized blocks, which are deemed to be significant towards training the model, to the FL server for aggregation. Extensive experiments evaluating FedOBD against four state-of-the-art approaches on multiple real-world datasets show that it reduces the overall communication overhead by more than 88% compared to the best-performing baseline approach, while achieving the highest test accuracy. To the best of our knowledge, FedOBD is the first approach to perform dropout on FL models at the block level rather than at the individual parameter level.
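
The block selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the Mean Block Difference of a block is the L2 norm of the change between the global and the locally trained parameters, divided by the number of parameters in the block, and that a client keeps the highest-ranked blocks that fit within a size budget of (1 - dropout rate) times the model size. The function names, the `dropout_rate` parameter, the skip-if-too-large rule, and the toy block values are all hypothetical, and quantization is omitted.

```python
import math


def mean_block_difference(old_block, new_block):
    """Assumed MBD: L2 norm of the parameter change divided by the block size."""
    if len(old_block) != len(new_block):
        raise ValueError("blocks must have the same number of parameters")
    squared = sum((new - old) ** 2 for old, new in zip(old_block, new_block))
    return math.sqrt(squared) / len(old_block)


def select_blocks(global_blocks, local_blocks, dropout_rate):
    """Return the names of blocks to upload, most-changed first.

    global_blocks / local_blocks: dict mapping block name -> flat list of floats.
    dropout_rate: fraction of the model size (in parameters) that is dropped.
    """
    total_size = sum(len(params) for params in global_blocks.values())
    budget = (1.0 - dropout_rate) * total_size

    # Rank blocks by how much local training changed them (largest first).
    ranked = sorted(
        global_blocks,
        key=lambda name: mean_block_difference(global_blocks[name], local_blocks[name]),
        reverse=True,
    )

    selected, used = [], 0
    for name in ranked:
        size = len(local_blocks[name])
        if used + size > budget:
            continue  # assumption: skip blocks that do not fit the remaining budget
        selected.append(name)
        used += size
    return selected


# Toy model with three "blocks" (hypothetical values).
global_blocks = {"conv1": [0.0, 0.0], "conv2": [1.0, 1.0, 1.0, 1.0], "fc": [0.5, 0.5]}
local_blocks = {"conv1": [0.3, 0.4], "conv2": [1.0, 1.0, 1.0, 1.1], "fc": [0.5, 0.6]}

print(select_blocks(global_blocks, local_blocks, dropout_rate=0.5))
```

With a dropout rate of 0.5, the budget is 4 of the 8 parameters, so the two small blocks that changed the most per parameter are kept and the largely unchanged `conv2` block is dropped. Ranking whole blocks rather than individual parameters is what keeps the uploaded payload structured, which is the contrast with parameter-level dropout that the abstract draws.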

Paper Structure (19 sections, 9 equations, 1 figure, 2 tables, 4 algorithms)