FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu
TL;DR
FedMef tackles memory bottlenecks in federated dynamic pruning for cross-device FL with two mechanisms: budget-aware extrusion (BaE), which transfers information out of parameters marked for pruning within a budget, and scaled activation pruning (SAP), which reduces activation memory. SAP uses Normalized Sparse Convolution (NSConv) to center activations around zero, which enables effective activation pruning under BN-free training, particularly with small batch sizes. BaE mitigates post-pruning accuracy loss by regularizing low-magnitude weights during extrusion and coupling pruning with growth, yielding a specialized sparse model that maintains accuracy while lowering memory and computational demands. Across CIFAR-10, CINIC-10, and TinyImageNet with ResNet18 and MobileNetV2, FedMef achieves higher accuracy and up to 28.5% memory savings compared with state-of-the-art federated pruning baselines, demonstrating practical impact for memory-constrained edge devices.
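To make the activation-pruning idea concrete, here is a minimal pure-Python sketch. It rests on assumptions not stated in the summary: activations are pruned by magnitude, and `center_and_scale` is only a stand-in for the zero-centering effect attributed to NSConv (it normalizes the activations directly, whereas NSConv is a convolution variant whose details are not given here). The function names and `keep_ratio` parameter are hypothetical.

```python
def center_and_scale(acts, eps=1e-8):
    """Shift activations to zero mean and unit variance (stand-in, not NSConv)."""
    mean = sum(acts) / len(acts)
    var = sum((a - mean) ** 2 for a in acts) / len(acts)
    std = (var + eps) ** 0.5
    return [(a - mean) / std for a in acts]


def prune_activations(acts, keep_ratio):
    """Zero all but the largest-magnitude entries of a flat activation list.

    Assumption: magnitude is the pruning criterion. Only the kept entries
    would need to be stored for the backward pass.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(acts) * keep_ratio))
    # Indices ordered by descending magnitude.
    order = sorted(range(len(acts)), key=lambda i: -abs(acts[i]))
    keep = set(order[:n_keep])
    return [a if i in keep else 0.0 for i, a in enumerate(acts)]


acts = [0.1, -2.0, 0.05, 3.0, -0.2, 0.0, 1.5, -0.01]
centered = center_and_scale(acts)
sparse = prune_activations(acts, keep_ratio=0.25)
```

When activations are centered around zero, most values are small in magnitude, so zeroing them discards comparatively little signal; that is the intuition this sketch is meant to illustrate, not a reproduction of SAP.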
Abstract
Federated learning (FL) promotes decentralized training while prioritizing data confidentiality. However, its application on resource-constrained devices is challenging due to the high demand for computation and memory resources to train deep learning models. Neural network pruning techniques, such as dynamic pruning, could enhance model efficiency, but directly adopting them in FL still poses substantial challenges, such as post-pruning performance degradation and high activation memory usage. To address these challenges, we propose FedMef, a novel and memory-efficient federated dynamic pruning framework. FedMef comprises two key components. First, we introduce budget-aware extrusion, which maintains pruning efficiency while preserving post-pruning performance by salvaging crucial information from parameters marked for pruning within a given budget. Second, we propose scaled activation pruning to effectively reduce activation memory footprints, which is particularly beneficial for deploying FL to memory-limited devices. Extensive experiments demonstrate the effectiveness of FedMef. In particular, it achieves a 28.5% reduction in memory footprint compared to state-of-the-art methods while obtaining superior accuracy.
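For readers unfamiliar with dynamic pruning, the following is a minimal sketch of one generic prune-and-grow mask update, written in pure Python. It is not budget-aware extrusion: the growth criterion (largest gradient magnitude among inactive weights) is an assumption borrowed from common dynamic sparse training practice, and the function name and arguments are hypothetical.

```python
def prune_and_grow(weights, grads, mask, k):
    """One mask update: deactivate k active weights, activate k inactive ones.

    Assumptions: prune by smallest weight magnitude, grow by largest
    gradient magnitude. The number of active weights stays constant.
    """
    active = [i for i, m in enumerate(mask) if m]
    inactive = [i for i, m in enumerate(mask) if not m]
    k = min(k, len(active), len(inactive))
    # Prune: the k active weights with the smallest magnitude.
    to_prune = sorted(active, key=lambda i: abs(weights[i]))[:k]
    # Grow: the k inactive weights with the largest gradient magnitude.
    to_grow = sorted(inactive, key=lambda i: -abs(grads[i]))[:k]
    new_weights = list(weights)
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = False
        new_weights[i] = 0.0  # whatever this weight had learned is discarded
    for i in to_grow:
        new_mask[i] = True
        new_weights[i] = 0.0  # newly grown weights start from zero
    return new_weights, new_mask


weights = [0.9, 0.01, -0.5, 0.0, 0.0, 0.3]
grads = [0.1, 0.2, 0.1, 0.8, 0.05, 0.1]
mask = [True, True, True, False, False, True]
new_weights, new_mask = prune_and_grow(weights, grads, mask, k=1)
```

In this plain version, a pruned weight is simply set to zero, so any information it carried is lost at once. The abstract describes budget-aware extrusion as salvaging that information from parameters marked for pruning before they are removed.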
