pFedAFM: Adaptive Feature Mixture for Batch-Level Personalization in Heterogeneous Federated Learning
Liping Yi, Han Yu, Chao Ren, Heng Zhang, Gang Wang, Xiaoguang Liu, Xiaoxiao Li
TL;DR
pFedAFM addresses batch-level heterogeneity in model-heterogeneous personalized FL by pairing a global homogeneous feature extractor $\mathcal{G}(\theta)$, shared across clients, with each client's local heterogeneous model $\mathcal{F}_k(\omega_k)$. The two components are trained alternately to enable bidirectional knowledge transfer between them, and per-dimension trainable weights mix the generalized and personalized representations for each batch, giving adaptive batch-level personalization. The approach achieves a non-convex convergence rate of $\mathcal{O}(1/T)$ and demonstrates up to $7.93\%$ accuracy gains over seven state-of-the-art baselines on CIFAR-10/100 under both pathological and practical non-IID settings, with lower communication and computation costs. These results suggest pFedAFM balances global generalization and local specialization in heterogeneous FL without requiring public data.
Abstract
Model-heterogeneous personalized federated learning (MHPFL) enables FL clients to train structurally different personalized models on non-independent and identically distributed (non-IID) local data. Existing MHPFL methods focus on achieving client-level personalization, but cannot address batch-level data heterogeneity. To bridge this important gap, we propose a model-heterogeneous personalized Federated learning approach with Adaptive Feature Mixture (pFedAFM) for supervised learning tasks. It consists of three novel designs: 1) A shared global homogeneous small feature extractor is assigned alongside each client's local heterogeneous model (consisting of a heterogeneous feature extractor and a prediction header) to facilitate cross-client knowledge fusion. Both feature extractors feed the local heterogeneous model's prediction header, which contains rich personalized prediction knowledge, so that personalized prediction capability is retained. 2) An iterative training strategy alternately trains the global homogeneous small feature extractor and the local heterogeneous large model for effective global-local knowledge exchange. 3) A trainable weight vector dynamically mixes the features extracted by the two feature extractors to adapt to batch-level data heterogeneity. Theoretical analysis proves that pFedAFM can converge over time. Extensive experiments on 2 benchmark datasets demonstrate that it significantly outperforms 7 state-of-the-art MHPFL methods, achieving up to 7.93% accuracy improvement while incurring low communication and computation costs.
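To make the third design concrete, the following is a minimal sketch of per-dimension feature mixing, not the authors' implementation. It assumes the mix is a convex combination, `alpha[d] * local + (1 - alpha[d]) * global`, that the shared prediction header is a single linear unit, and that the weight vector is updated by gradient descent on a toy squared-error loss and clipped to [0, 1]. The names `mix_features`, `linear_header`, and `update_alpha` are hypothetical and chosen only for illustration.

```python
from typing import List


def mix_features(global_feat: List[float], local_feat: List[float],
                 alpha: List[float]) -> List[float]:
    """Mix two feature vectors dimension by dimension.

    Assumption: a convex combination with one weight per dimension,
    alpha[d] * local_feat[d] + (1 - alpha[d]) * global_feat[d].
    """
    if not (len(global_feat) == len(local_feat) == len(alpha)):
        raise ValueError("feature vectors and alpha must have the same length")
    return [a * l + (1.0 - a) * g
            for g, l, a in zip(global_feat, local_feat, alpha)]


def linear_header(features: List[float], weights: List[float],
                  bias: float) -> float:
    """Toy stand-in for the shared prediction header: one linear unit."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def update_alpha(alpha: List[float], global_feat: List[float],
                 local_feat: List[float], header_weights: List[float],
                 error: float, lr: float) -> List[float]:
    """One gradient step on alpha for the loss 0.5 * error**2.

    With error = prediction - target and a linear header, the gradient
    w.r.t. alpha[d] is error * header_weights[d] * (local[d] - global[d]).
    Clipping to [0, 1] is an assumption that keeps the mix convex.
    """
    updated = []
    for a, g, l, w in zip(alpha, global_feat, local_feat, header_weights):
        grad = error * w * (l - g)
        updated.append(min(1.0, max(0.0, a - lr * grad)))
    return updated


if __name__ == "__main__":
    g_feat = [1.0, 2.0]      # features from the global extractor (made up)
    l_feat = [3.0, 6.0]      # features from the local extractor (made up)
    alpha = [0.5, 0.5]       # trainable per-dimension mixing weights
    w, b, target = [1.0, 1.0], 0.0, 8.0

    pred = linear_header(mix_features(g_feat, l_feat, alpha), w, b)
    alpha = update_alpha(alpha, g_feat, l_feat, w, pred - target, lr=0.01)
    new_pred = linear_header(mix_features(g_feat, l_feat, alpha), w, b)
    print(alpha, pred, new_pred)
```

In this toy setting the update shifts weight toward whichever extractor's features reduce the loss on the current batch, which is the intuition behind adapting the mix at batch level. A real implementation would use a classification loss and learn the weights jointly with the networks.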
