Communication-Efficient Multimodal Federated Learning: Joint Modality and Client Selection

Liangqi Yuan; Dong-Jun Han; Su Wang; Devesh Upadhyay; Christopher G. Brinton

TL;DR

This work addresses efficient learning in multimodal federated settings where clients hold heterogeneous sets of modalities and communication is constrained. It introduces mmFedMC, a framework built on decision-level fusion: each client trains modular per-modality models, which can be uploaded to the server, and keeps an ensemble model locally, which enables personalization. Each client selects which modality models to upload using a composite priority that combines Shapley-value-based impact, modality model size, and recency; the server then selects clients based on local loss, further reducing communication without sacrificing performance. Across five real-world datasets, mmFedMC achieves accuracy comparable to several baselines while reducing communication overhead by over 20x, which makes it practical for IoT and edge deployments. The modular design suits heterogeneous multimodal FL and leaves room for dynamic configuration and broader modality support in future work.
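To make the Shapley-value-based impact concrete, the following is a minimal sketch of an exact Shapley computation over a small set of modalities. It is not the authors' implementation: the function name `shapley_values`, the modality names, and the toy payoff table (standing in for the accuracy of a local ensemble restricted to a subset of modalities) are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small set of players.

    `value` maps a frozenset of players to a real-valued payoff.
    The cost is exponential in len(players), so this only suits
    a handful of modalities.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of every coalition of size k that excludes p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical payoff table: ensemble accuracy per modality subset.
toy_accuracy = {
    frozenset(): 0.0,
    frozenset({"imu"}): 0.6,
    frozenset({"audio"}): 0.4,
    frozenset({"imu", "audio"}): 0.8,
}

phi = shapley_values(["imu", "audio"], toy_accuracy.__getitem__)
```

In this toy case the two values sum to the payoff of the full modality set, which is the efficiency property of Shapley values; a modality with a larger value contributes more to the ensemble's output.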

Abstract

Multimodal federated learning (FL) aims to enrich model training in FL settings where clients are collecting measurements across multiple modalities. However, key challenges to multimodal FL remain unaddressed, particularly in heterogeneous network settings where: (i) the set of modalities collected by each client will be diverse, and (ii) communication limitations prevent clients from uploading all their locally trained modality models to the server. In this paper, we propose multimodal Federated learning with joint Modality and Client selection (mmFedMC), a new FL methodology that can tackle the above-mentioned challenges in multimodal settings. The joint selection algorithm incorporates two main components: (a) a modality selection methodology for each client, which weighs (i) the impact of the modality, gauged by Shapley value analysis, and (ii) the modality model size, as a gauge of communication overhead, against (iii) the frequency of modality model updates, denoted recency, to enhance generalizability; and (b) a client selection strategy for the server based on the local loss of each modality model at each client. Experiments on five real-world datasets demonstrate the ability of mmFedMC to achieve comparable accuracy to several baselines while reducing the communication overhead by over 20x. A demo video of our methodology is available at https://liangqiy.com/mmfedmc/.
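The two selection steps described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the abstract does not give the priority formula, so a simple weighted sum is assumed here, with recency taken to mean normalized rounds since a modality model was last uploaded. The abstract also does not say whether low-loss or high-loss clients are preferred, so that choice is exposed as a parameter. All names (`ModalityStats`, `priority`, `select_modalities`, `select_clients`) and weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModalityStats:
    name: str
    impact: float   # assumed: Shapley-based impact, normalized to [0, 1]
    size: float     # assumed: model size, normalized to [0, 1]
    recency: float  # assumed: rounds since last upload, normalized to [0, 1]


def priority(m, w_impact=1.0, w_size=1.0, w_recency=1.0):
    """Assumed linear priority: reward impact and staleness, penalize size."""
    return w_impact * m.impact - w_size * m.size + w_recency * m.recency


def select_modalities(stats, k, **weights):
    """Client side: names of the k modality models with highest priority."""
    ranked = sorted(stats, key=lambda m: priority(m, **weights), reverse=True)
    return [m.name for m in ranked[:k]]


def select_clients(local_loss, k, prefer_low_loss=True):
    """Server side: pick k clients for one modality model by local loss.

    `local_loss` maps client id -> reported loss. The ranking direction
    is an assumption and can be flipped with `prefer_low_loss`.
    """
    ranked = sorted(local_loss, key=local_loss.get, reverse=not prefer_low_loss)
    return ranked[:k]
```

With these assumed weights, a large but only moderately impactful model (for example, video) can rank below a small model that has not been uploaded recently, which is the trade-off between communication overhead and generalizability that the abstract describes.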

Paper Structure (30 sections, 21 equations, 6 figures, 6 tables)

Introduction
Motivation
Overview and Contribution
Organization
Related Works
Multimodal Federated Learning
Modality Selection
Client Selection within Federated Learning
Formulation and Methodology
Federated Learning and Decision-level Fusion
Overview of Proposed mmFedMC Algorithm
Client Learning
Modality Selection
Client Selection
Modality Model Aggregation
...and 15 more sections

Figures (6)

Figure 1: Schematic representation of traditional multimodal federated learning vs. the proposed mmFedMC.
Figure 2: System diagram of the proposed mmFedMC illustrating the process of Modality Selection and Client Selection as detailed in the mmFedMC algorithm.
Figure 3: Data visualization of ActionSense dataset for subject 00 engaged in peeling a potato.
Figure 4: Comparison of accuracy between proposed mmFedMC and seven baselines on the communication overhead scale.
Figure 5: The mean Shapley value (i.e., impact) of modality models throughout the mmFedMC iteration.
...and 1 more figure
