Exploiting Features and Logits in Heterogeneous Federated Learning
Yun-Hin Chan, Edith C.-H. Ngai
TL;DR
This work tackles federated learning in which clients have heterogeneous neural architectures and privacy constraints, without requiring a public dataset. It introduces Felo, which exchanges per-class mid-level features and logits to guide local updates, and Velo, which adds a server-side conditional VAE (CVAE) to model latent relationships and generate synthetic features. Experiments on CIFAR-10 and CINIC-10 show that Felo, and especially Velo, outperform state-of-the-art heterogeneous FL baselines and can even surpass FedAvg in homogeneous settings, demonstrating robustness to non-IID data and extreme model heterogeneity. The methods offer a practical approach to edge AI in IoT, enabling effective training across diverse devices while preserving privacy; parallelizing server and client computation is a promising direction.
Abstract
Due to the rapid growth of IoT and artificial intelligence, deploying neural networks on IoT devices is becoming increasingly crucial for edge intelligence. Federated learning (FL) coordinates edge devices to collaboratively train a shared model while keeping training data local and private. However, a common assumption in FL is that all edge devices train the same machine learning model, which may be impractical given diverse device capabilities. For instance, less capable devices may slow down the updating process because they struggle to handle large models suited to ordinary devices. In this paper, we propose a novel data-free FL method that supports heterogeneous client models by managing features and logits, called Felo, and its extension with a conditional VAE deployed on the server, called Velo. In Felo, the server averages the mid-level features and logits from the clients by class label, and the resulting average features and logits are used to further train the client models. In Velo, by contrast, the server hosts a conditional VAE that is trained on the mid-level features and generates synthetic features according to the labels. The clients then optimize their models using the synthetic features and the average logits. We conduct experiments on two datasets and show satisfactory performance of our methods compared with state-of-the-art methods.
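The class-wise averaging that the abstract attributes to Felo can be illustrated with a minimal sketch. This is not the authors' implementation: the upload format `(label, feature, logits)`, the helper names `mean_vectors` and `aggregate_by_class`, and the assumption that every client produces mid-level features of the same dimension are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
# Hypothetical upload format: (class_label, mid_level_feature, logits).
Upload = Tuple[int, Vector, Vector]


def mean_vectors(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def aggregate_by_class(uploads: Sequence[Upload]) -> Dict[int, Tuple[Vector, Vector]]:
    """Average features and logits separately for each class label."""
    features = defaultdict(list)
    logits = defaultdict(list)
    for label, feature, logit in uploads:
        features[label].append(feature)
        logits[label].append(logit)
    return {
        label: (mean_vectors(features[label]), mean_vectors(logits[label]))
        for label in features
    }


# Toy data: two uploads for class 0 and one for class 1.
uploads = [
    (0, [1.0, 3.0], [2.0, 0.0]),
    (0, [3.0, 5.0], [4.0, 2.0]),
    (1, [0.5, 0.5], [0.0, 1.0]),
]
averages = aggregate_by_class(uploads)
print(averages[0])  # ([2.0, 4.0], [3.0, 1.0])
```

In this sketch the server would send `averages` back to the clients, which would use the per-class average features and logits as additional training targets. How those targets enter the client loss is not specified in the abstract and is left out here.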
