FLUE: Federated Learning with Un-Encrypted model weights
Elie Atallah
TL;DR
The paper addresses privacy in federated learning by proposing FLUE, which forgoes encryption and instead protects private data through coded local gradients and exchanges of proxy weights augmented with surplus noise. The core idea is to encode data partitions with a singular matrix ${\bf B}$ and to propagate proxies ${\bf\bar{x}}$ through an aggregation scheme built around the matrices ${\bf A}$, ${\bar{\bar{\bf A}}}^l$, and ${\bf D}^l$. Two variants are given, a general form with surpluses and a special form without them, and both support fixed or time-varying coding schemes with replication options. Convergence is established for both static and time-varying networks, using stochastic, indecomposable, and aperiodic (SIA) matrices and a spectral analysis of the update operators. The convergence rate depends on the coding structure, with a reported rate of $O(\log k/\sqrt{k})$ for common step sizes. Simulations on convex problems and neural-network experiments on MNIST show that FLUE can match or surpass traditional distributed gradient methods when the coding scheme is appropriately aligned with the data, while providing encryption-free privacy through coded updates and proxy weights. The work suggests a scalable, privacy-conscious alternative to encryption in federated settings, and leaves stronger zero-coordination privacy and randomized encoding to future work.
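As a rough illustration of where a rate of the form $O(\log k/\sqrt{k})$ typically comes from, the sketch below evaluates the ratio $\sum_{i\le k}\alpha_i^2 / \sum_{i\le k}\alpha_i$ for the step size $\alpha_i = 1/\sqrt{i}$. This ratio is the dominant term in the standard bound for (sub)gradient methods; treating $\alpha_i = 1/\sqrt{i}$ as the "common step size" is an assumption made here, and the paper's own analysis and constants may differ. The function names are hypothetical.

```python
import math


def step_size_ratio(k):
    """Return sum(alpha_i^2) / sum(alpha_i) for alpha_i = 1/sqrt(i), i = 1..k.

    Assumption: alpha_i = 1/sqrt(i) stands in for the 'common step size'
    mentioned in the summary; it is not taken from the paper.
    """
    sum_sq = sum(1.0 / i for i in range(1, k + 1))            # alpha_i^2 = 1/i
    sum_lin = sum(1.0 / math.sqrt(i) for i in range(1, k + 1))
    return sum_sq / sum_lin


def reference_rate(k):
    """The reference curve log(k) / sqrt(k)."""
    return math.log(k) / math.sqrt(k)


if __name__ == "__main__":
    for k in (10, 100, 1000, 10000):
        print(k, step_size_ratio(k), reference_rate(k))
```

For the values of $k$ shown, the ratio decreases with $k$ and stays below $\log k/\sqrt{k}$, which is consistent with the reported order of the rate.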
Abstract
Federated learning enables diverse devices to collaboratively train a shared model while keeping training data stored locally, avoiding the need for centralized cloud storage. Despite existing privacy measures, gradients can potentially be reverse-engineered to reveal private data, even when noise is added. To address this, recent research emphasizes using encrypted model parameters during training. This paper introduces a novel federated learning algorithm that uses coded local gradients without encryption, exchanges coded proxies in place of model parameters, and injects surplus noise for enhanced privacy. Two variants of the algorithm are presented, whose convergence and learning rates adapt to the coding scheme and the characteristics of the raw data. Two encryption-free implementations, one with fixed and one with random coding matrices, are provided, and they show promising simulation results from both federated optimization and machine learning perspectives.
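To make the notion of "coded local gradients" concrete, the following is a minimal sketch of generic gradient coding with three nodes and three data partitions. It is not the FLUE algorithm: the matrix `B`, the decoding table `DECODERS`, and the function names are illustrative assumptions, and the proxy-weight exchange and surplus noise described above are omitted. Each node sends only a linear combination of partition gradients, yet the full gradient sum can be recovered from any two nodes. The toy matrix used here happens to be singular (its determinant is zero).

```python
# Toy coding matrix: row i holds the weights node i applies to the
# partition gradients g_0, g_1, g_2. Illustrative assumption, not the paper's B.
B = [
    [0.5, 1.0, 0.0],
    [0.0, 1.0, -1.0],
    [0.5, 0.0, 1.0],
]

# For each pair of responding nodes, weights a such that
# a[0] * B[first] + a[1] * B[second] == [1, 1, 1].
DECODERS = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def encode(node, partition_grads):
    """Coded gradient sent by `node`: sum_j B[node][j] * g_j."""
    dim = len(partition_grads[0])
    return [
        sum(B[node][j] * partition_grads[j][d] for j in range(len(B[node])))
        for d in range(dim)
    ]


def decode(responders, coded):
    """Recover g_0 + g_1 + g_2 from the coded gradients of two nodes."""
    weights = DECODERS[tuple(responders)]
    dim = len(coded[0])
    return [sum(w * c[d] for w, c in zip(weights, coded)) for d in range(dim)]


if __name__ == "__main__":
    grads = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]  # made-up partition gradients
    for pair in DECODERS:
        coded = [encode(i, grads) for i in pair]
        print(pair, decode(pair, coded))
```

In this sketch no single message equals the gradient of one partition, which gives some intuition for why coding can obscure raw per-partition information; the privacy guarantees claimed for FLUE rest on the paper's full construction, not on this toy example.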
