Aggregation Design for Personalized Federated Multi-Modal Learning over Wireless Networks
Benshun Yin, Zhiyong Chen, Meixia Tao
TL;DR
This work tackles personalized Federated Multi-Modal Learning (FMML) over wireless networks under modality heterogeneity and non-IID data. It introduces a learning-based scheme to optimize per-device, per-modality aggregation coefficients $\xi^m_{k,k',t}$ and integrates a modality-aware parameter scheduling policy that leverages channel state to upload only a subset of parameters. Aggregation coefficients are updated via gradient-based steps that connect local losses $F_k$ to server parameters $\bm{W}^m_{k,t}$, enabling personalization without adding communication overhead. Experiments on CREMA-D and MOSEI demonstrate higher personalized accuracy and reduced training time compared with baselines such as FedAvg, FedProx, FedFomo, and FedAMP, validating the practical impact of the proposed approach.
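The coefficient update described above can be illustrated with a minimal NumPy sketch for a single device $k$ and a single modality $m$. It assumes the personalized parameters are a linear combination $\bm{W} = \sum_{k'} \xi_{k'} \bm{w}_{k'}$ of flattened local parameters, so the chain rule gives $\partial F_k / \partial \xi_{k'} = \langle \nabla_{\bm{W}} F_k, \bm{w}_{k'} \rangle$. The function names, the learning rate, and the clip-and-renormalize step are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def aggregate(xi_k, local_params):
    """Personalized parameters for one device: weighted sum over devices k'.

    xi_k: (K,) aggregation coefficients; local_params: (K, D) flattened parameters.
    """
    return xi_k @ local_params

def update_coefficients(xi_k, local_params, grad_wrt_W, lr=0.1):
    """One gradient step on xi_k, then clip and renormalize (an assumed constraint)."""
    # Chain rule: W = sum_k' xi[k'] * w[k'], so dF/dxi[k'] = <grad_W F, w[k']>.
    grad_xi = local_params @ grad_wrt_W
    xi_new = np.clip(xi_k - lr * grad_xi, 0.0, None)
    total = xi_new.sum()
    # Fall back to uniform weights if every coefficient was clipped to zero.
    return xi_new / total if total > 0 else np.full_like(xi_k, 1.0 / len(xi_k))

# Toy example: 3 devices, 2 parameters, quadratic local loss around a target.
local_params = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
xi = np.full(3, 1.0 / 3.0)
target = np.array([1.0, 0.0])          # hypothetical optimum for device 0
W = aggregate(xi, local_params)
grad_W = W - target                    # gradient of 0.5 * ||W - target||^2
xi_next = update_coefficients(xi, local_params, grad_W)
```

In this toy setting the coefficient on device 0's own parameters grows and the coefficient on device 1 shrinks, because device 1's parameters point away from the target. The update uses only parameters the server already holds plus a gradient signal, which is in line with the claim that personalization need not add communication overhead.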
Abstract
Federated Multi-Modal Learning (FMML) is an emerging field that integrates information from different modalities in federated learning to improve learning performance. In this letter, we develop a parameter scheduling scheme to improve personalized performance and communication efficiency in personalized FMML, considering non-independent and non-identically distributed (non-IID) data along with modality heterogeneity. Specifically, a learning-based approach is used to obtain the aggregation coefficients for the parameters of different modalities on distinct devices. Based on the aggregation coefficients and the channel state, a subset of parameters is scheduled for upload to the server for each modality. Experimental results show that the proposed algorithm effectively improves the personalized performance of FMML.
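To make the scheduling idea concrete, the following is a minimal sketch of choosing, for one modality, which devices upload their parameters. The abstract does not spell out the scheduling rule, so the score used here (how much other devices rely on a device's parameters, scaled by its channel gain) and the names `schedule_modality`, `channel_gain`, and `budget` are hypothetical stand-ins rather than the letter's policy.

```python
import numpy as np

def schedule_modality(xi_m, channel_gain, budget):
    """Pick `budget` devices to upload their parameters for one modality.

    xi_m: (K, K) array, xi_m[k, k2] = weight device k places on device k2.
    channel_gain: (K,) array of nonnegative uplink channel gains.
    """
    # Hypothetical importance: total weight that *other* devices place on k2.
    importance = xi_m.sum(axis=0) - np.diag(xi_m)
    # Hypothetical score: favor parameters that are both useful and cheap to send.
    score = importance * channel_gain
    order = np.argsort(-score, kind="stable")
    return np.sort(order[:budget])

# Toy example: 3 devices, at most 2 uploads for this modality.
xi_m = np.array([[0.6, 0.3, 0.1],
                 [0.2, 0.7, 0.1],
                 [0.3, 0.3, 0.4]])
channel_gain = np.array([1.0, 0.1, 2.0])
selected = schedule_modality(xi_m, channel_gain, budget=2)
```

Here device 1 has the most useful parameters for its peers but a poor channel, so the sketch schedules devices 0 and 2 instead. Running the same function once per modality gives a modality-aware schedule in the spirit of the scheme described above.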
