FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System
Weizhao Jin, Yuhang Yao, Shanshan Han, Jiajun Gu, Carlee Joe-Wong, Srivatsan Ravi, Salman Avestimehr, Chaoyang He
TL;DR
FedML-HE tackles privacy leakage in federated learning by enabling secure model aggregation via homomorphic encryption (HE), while addressing the prohibitive overhead that has limited HE deployment on large foundation models. The authors introduce Selective Parameter Encryption, which encrypts only the most privacy-sensitive parameters, as identified by a data-driven privacy map, and thereby reduces both computation and communication costs. They provide a formal privacy analysis of the base and selective protocols, including differential privacy (DP) considerations, and demonstrate through extensive experiments that FedML-HE achieves up to ~10x overhead reduction for ResNet-50 and up to ~40x for BERT compared with fully encrypted baselines. The work offers a practical pathway to scalable HE-based FL deployments with encryption-key management, flexible privacy guarantees, and a modular software framework that supports runtime optimization and diverse HE backends.
Abstract
Federated Learning (FL) trains machine learning models on distributed devices by aggregating local model updates instead of local data. However, privacy concerns arise because the local models aggregated on the server may reveal sensitive personal information through inversion attacks. Privacy-preserving methods, such as homomorphic encryption (HE), therefore become necessary for FL training. Despite HE's privacy advantages, its applications suffer from impractical overheads, especially for foundation models. In this paper, we present FedML-HE, the first practical federated learning system with efficient HE-based secure model aggregation. FedML-HE selectively encrypts sensitive parameters, significantly reducing both computation and communication overheads during training while providing customizable privacy preservation. Our optimized system achieves considerable overhead reduction, particularly for large foundation models (e.g., ~10x reduction for ResNet-50, and up to ~40x reduction for BERT), demonstrating the potential for scalable HE-based FL deployment.
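To make the idea of selective encryption concrete, the following is a minimal sketch and not the FedML-HE implementation. It assumes that every client shares one encryption mask, that the mask marks the top fraction of parameters ranked by a per-parameter sensitivity score, and that the server only adds values position by position. The names `MockCiphertext`, `build_mask`, `client_encode`, `server_aggregate`, and `client_decode` are hypothetical, and `MockCiphertext` is a stand-in that offers no security; a real deployment would replace it with ciphertexts from an additively homomorphic scheme.

```python
import math
from dataclasses import dataclass


@dataclass
class MockCiphertext:
    """Placeholder for an additively homomorphic ciphertext (NOT secure)."""
    value: float

    def __add__(self, other: "MockCiphertext") -> "MockCiphertext":
        # A real HE scheme would add ciphertexts without seeing the plaintexts.
        return MockCiphertext(self.value + other.value)


def build_mask(sensitivity, ratio):
    """Mark the top `ratio` fraction of parameters (by sensitivity) for encryption."""
    k = math.ceil(ratio * len(sensitivity))
    ranked = sorted(range(len(sensitivity)), key=lambda i: sensitivity[i], reverse=True)
    mask = [False] * len(sensitivity)
    for i in ranked[:k]:
        mask[i] = True
    return mask


def client_encode(update, mask):
    """Encrypt only the masked parameters; send the rest in plaintext."""
    return [MockCiphertext(w) if m else w for w, m in zip(update, mask)]


def server_aggregate(encoded_updates):
    """Sum the updates position by position; the server never decrypts."""
    total = list(encoded_updates[0])
    for update in encoded_updates[1:]:
        total = [a + b for a, b in zip(total, update)]
    return total


def client_decode(aggregate, mask, n_clients):
    """Decrypt the masked positions and average over the number of clients."""
    return [(v.value if m else v) / n_clients for v, m in zip(aggregate, mask)]


# Hypothetical sensitivity scores for a five-parameter model.
sensitivity = [0.9, 0.1, 0.5, 0.05, 0.7]
mask = build_mask(sensitivity, ratio=0.4)  # encrypt 2 of 5 parameters

updates = [
    [1.0, 2.0, 3.0, 4.0, 5.0],  # client A
    [3.0, 2.0, 1.0, 0.0, 1.0],  # client B
]
encoded = [client_encode(u, mask) for u in updates]
aggregate = server_aggregate(encoded)
averaged = client_decode(aggregate, mask, n_clients=len(updates))
print(mask)
print(averaged)
```

In this sketch only two of the five parameters pass through the (mock) encryption path, so the expensive ciphertext operations and the ciphertext payload shrink roughly in proportion to the encryption ratio, which is the intuition behind the overhead reductions reported above.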
