Aggregation Design for Personalized Federated Multi-Modal Learning over Wireless Networks

Benshun Yin, Zhiyong Chen, Meixia Tao

TL;DR

This work tackles personalized Federated Multi-Modal Learning (FMML) over wireless networks under modality heterogeneity and non-IID data. It introduces a learning-based scheme to optimize per-device, per-modality aggregation coefficients $\xi^m_{k,k',t}$ and integrates a modality-aware parameter scheduling policy that leverages channel state to upload only a subset of parameters. Aggregation coefficients are updated via gradient-based steps that connect local losses $F_k$ to server parameters $\bm{W}^m_{k,t}$, enabling personalization without adding communication overhead. Experiments on CREMA-D and MOSEI demonstrate higher personalized accuracy and reduced training time compared with baselines such as FedAvg, FedProx, FedFomo, and FedAMP, validating the practical impact of the proposed approach.

Abstract

Federated Multi-Modal Learning (FMML) is an emerging field that integrates information from different modalities in federated learning to improve learning performance. In this letter, we develop a parameter scheduling scheme to improve personalized performance and communication efficiency in personalized FMML, considering non-independent and non-identically distributed (non-IID) data along with modality heterogeneity. Specifically, a learning-based approach is used to obtain the aggregation coefficients for the parameters of different modalities on distinct devices. Based on the aggregation coefficients and the channel state, a subset of parameters is scheduled to be uploaded to a server for each modality. Experimental results show that the proposed algorithm can effectively improve the personalized performance of FMML.
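
To make the idea of learned, per-device and per-modality aggregation coefficients concrete, the following is a minimal sketch rather than the authors' algorithm. It assumes that the personalized parameters of device $k$ for one modality are a weighted sum of the uploaded parameters, and that the coefficients are updated by a plain gradient step followed by clipping and renormalization. The function names `aggregate` and `update_coeffs`, the learning rate, and the clip-and-renormalize projection are all hypothetical choices made for illustration; the paper's exact update rule and its channel-aware scheduling policy are not reproduced here.

```python
from typing import Dict, List

Vector = List[float]


def aggregate(coeffs: Dict[int, float], uploads: Dict[int, Vector]) -> Vector:
    """Weighted sum of the uploaded parameter vectors for one modality.

    coeffs[k'] plays the role of xi^m_{k,k'} for a fixed device k and modality m.
    uploads[k'] is the parameter vector uploaded by a scheduled device k'.
    """
    dim = len(next(iter(uploads.values())))
    out = [0.0] * dim
    for k_prime, w in uploads.items():
        for i, value in enumerate(w):
            out[i] += coeffs[k_prime] * value
    return out


def update_coeffs(
    coeffs: Dict[int, float],
    uploads: Dict[int, Vector],
    grad: Vector,
    lr: float = 0.1,
) -> Dict[int, float]:
    """One gradient step on the coefficients (illustrative assumption).

    grad is the gradient of the local loss F_k with respect to the aggregated
    parameters. Because the aggregate is linear in the coefficients, the chain
    rule gives dF_k / d coeffs[k'] = <grad, uploads[k']>.
    """
    stepped = {}
    for k_prime, w in uploads.items():
        partial = sum(g * v for g, v in zip(grad, w))
        # Assumed projection: clip at zero so coefficients stay non-negative.
        stepped[k_prime] = max(coeffs[k_prime] - lr * partial, 0.0)
    total = sum(stepped.values())
    if total == 0.0:
        # Fall back to uniform weights if every coefficient was clipped.
        return {k_prime: 1.0 / len(stepped) for k_prime in stepped}
    # Assumed normalization: coefficients sum to one.
    return {k_prime: value / total for k_prime, value in stepped.items()}


# Toy example: two devices upload 2-dimensional parameters for one modality.
uploads = {0: [1.0, 0.0], 1: [0.0, 1.0]}
coeffs = {0: 0.5, 1: 0.5}
personalized = aggregate(coeffs, uploads)
# A gradient that favors moving toward device 1's parameters.
new_coeffs = update_coeffs(coeffs, uploads, grad=[1.0, -1.0], lr=0.1)
```

In this toy run, the local gradient indicates that device 1's parameters reduce the loss of device $k$, so the coefficient on device 1 grows while the coefficient on device 0 shrinks. Only the devices whose parameters were actually uploaded appear in `uploads`, which is how a scheduling step would interact with aggregation in this sketch.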

Paper Structure (13 sections, 9 equations, 3 figures, 5 tables, 1 algorithm)


Figures (3)

  • Figure 1: A federated multi-modal learning system.
  • Figure 2: Execution process of personalized federated multi-modal learning systems with the update of aggregation coefficients.
  • Figure 3: The variation of the aggregation coefficients of (a) audio modality, (b) visual modality on CREMA-D with the non-IID-1 distribution.