Federated Neural Architecture Search with Model-Agnostic Meta Learning
Xinyuan Huang, Jiechao Gao
TL;DR
FedMetaNAS tackles non-IID data in federated learning by integrating Model-Agnostic Meta-Learning (MAML) with Neural Architecture Search (NAS) to speed up architecture optimization and remove retraining. It employs Gumbel-Softmax reparameterization to relax the search space and soft pruning of input nodes, enabling a single-stage NAS workflow in FL. A dual-learner setup (task-specific and meta) jointly optimizes weights and architectures, with a first-order approximation to reduce cost. Empirical results on CIFAR10, CIFAR100, and MNIST show that FedMetaNAS delivers higher accuracy than FedNAS while reducing search time by over 50%, with strong robustness to non-IID distributions and favorable personalization performance.
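The dual-learner update with a first-order approximation can be illustrated with a minimal sketch. This is not the authors' implementation: the quadratic per-task loss, the function names (`inner_adapt`, `fomaml_step`), the learning rates, and the flat parameter vector standing in for concatenated weights and alphas are all assumptions made for illustration. The task learner takes a gradient step on its own task, and the meta learner averages the gradients evaluated at the adapted parameters, ignoring second-order terms.

```python
def grad(params, target):
    # Gradient of the toy task loss 0.5 * ||params - target||^2 (assumed loss).
    return [p - t for p, t in zip(params, target)]


def inner_adapt(params, target, inner_lr, steps=1):
    # Task-specific learner: adapt a copy of the shared parameters to one task.
    adapted = list(params)
    for _ in range(steps):
        g = grad(adapted, target)
        adapted = [p - inner_lr * gi for p, gi in zip(adapted, g)]
    return adapted


def fomaml_step(params, task_targets, inner_lr, outer_lr):
    # Meta learner: average the task gradients taken at the adapted parameters.
    # First-order approximation: no differentiation through the inner step.
    meta_grad = [0.0] * len(params)
    for target in task_targets:
        adapted = inner_adapt(params, target, inner_lr)
        g = grad(adapted, target)
        meta_grad = [m + gi / len(task_targets) for m, gi in zip(meta_grad, g)]
    return [p - outer_lr * m for p, m in zip(params, meta_grad)]


# Hypothetical setup: two tasks, parameters = weights and alphas concatenated.
task_targets = [[1.0, 0.0, 2.0], [3.0, 1.0, 0.0]]
params = [0.0, 0.0, 0.0]
for _ in range(200):
    params = fomaml_step(params, task_targets, inner_lr=0.1, outer_lr=0.5)
```

In this toy setting the meta parameters settle at the mean of the task optima, a starting point from which each task can be reached in a few inner steps. A real search would replace `grad` with backpropagation through the supernet.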
Abstract
Federated Learning (FL) often struggles with data heterogeneity due to the naturally uneven distribution of user data across devices. Federated Neural Architecture Search (NAS) enables collaborative search for model architectures tailored to heterogeneous data, which can yield higher accuracy. However, this process is time-consuming because of the extensive search space and the retraining stage. To overcome this, we introduce FedMetaNAS, a framework that integrates meta-learning with NAS within the FL context to expedite the architecture search by pruning the search space and eliminating the retraining stage. Our approach first uses the Gumbel-Softmax reparameterization to relax the mixed operations in the search space. We then refine the local search process by incorporating Model-Agnostic Meta-Learning, where a task-specific learner adapts both weights and architecture parameters (alphas) for individual tasks, while a meta learner adjusts the overall model weights and alphas based on the gradient information from the task learners. Following the meta-update, we propose soft pruning, which applies the same Gumbel-Softmax trick to the search space to gradually sparsify the architecture. This keeps the performance of the chosen architecture robust after pruning, so the model can be used immediately without retraining. Experimental evaluations demonstrate that FedMetaNAS reduces search time by more than 50% while achieving higher accuracy than FedNAS.
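As a rough illustration of the Gumbel-Softmax relaxation of a mixed operation, the sketch below samples relaxed one-hot weights over candidate operations from architecture parameters. It is a minimal sketch under stated assumptions, not the paper's code: the function names (`sample_gumbel`, `relaxed_weights`, `mixed_op`), the three toy candidate operations, and the temperature values are hypothetical.

```python
import math
import random


def sample_gumbel(n, rng):
    # Gumbel(0, 1) noise via g = -log(-log(u)), u ~ Uniform(0, 1).
    noise = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logs finite
        noise.append(-math.log(-math.log(u)))
    return noise


def relaxed_weights(alphas, noise, temperature):
    # Softmax over (alpha + noise) / temperature, computed stably.
    scaled = [(a + g) / temperature for a, g in zip(alphas, noise)]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, ops, weights):
    # Relaxed mixed operation: weighted sum of all candidate operations.
    return sum(w * op(x) for op, w in zip(ops, weights))


rng = random.Random(0)
alphas = [0.5, 1.5, -0.3]  # hypothetical architecture parameters
ops = [lambda x: x, lambda x: 2.0 * x, lambda x: 0.0]  # toy candidate ops
noise = sample_gumbel(len(alphas), rng)
soft = relaxed_weights(alphas, noise, temperature=1.0)
sharp = relaxed_weights(alphas, noise, temperature=0.1)
y = mixed_op(3.0, ops, soft)
```

For a fixed noise draw, lowering the temperature pushes the weights toward a one-hot vector, which is the property that lets a relaxed architecture be sparsified gradually rather than cut in a single discrete step.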
