FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization

Manh Duong Nguyen; Trung Thanh Nguyen; Huy Hieu Pham; Trong Nghia Hoang; Phi Le Nguyen; Thanh Trung Huynh

TL;DR

FedMAC tackles partial-modality missing in multi-modal federated learning by introducing a client-side Missing-Aware Encoder and Cross-Modal Aggregator, supervised by contrastive regularization to learn modality-invariant representations. The method uses modality-imputation embeddings to synchronize server-client information and a reconstruction-based cross-modal mechanism to reweight and fuse available modalities, while mitigating trivial aggregation through dual contrastive losses. Empirical results on a PTB-XL subset show FedMAC outperforming baselines by up to 26% under severe missingness, across both IID and Non-IID settings and under different server-client missing statistics. The work advances practical, privacy-preserving multi-modal FL by enabling robust learning despite instance-level modality heterogeneity and incomplete data.

Abstract

Federated Learning (FL) is a method for training machine learning models on distributed data sources. It preserves privacy by allowing clients to collaboratively learn a shared global model while keeping their data local. However, a significant challenge arises when modalities are missing from clients' datasets: certain features or modalities are unavailable or incomplete, leading to heterogeneous data distributions. While previous studies have addressed complete-modality missing, they fail to tackle partial-modality missing because of severe instance-level heterogeneity among clients, where the pattern of missing data can vary significantly from one sample to another. To tackle this challenge, this study proposes a novel framework named FedMAC, designed to handle multi-modal learning under partial-modality missing in FL. Additionally, to avoid trivial aggregation of multi-modal features, we introduce contrastive-based regularization to impose additional constraints on the latent representation space. Experimental results demonstrate the effectiveness of FedMAC across various client configurations with statistical heterogeneity, outperforming baseline methods by up to 26% in severe missing scenarios and highlighting its potential as a solution for partially missing modalities in federated systems. Our source code is provided at https://github.com/nmduonggg/PEPSY

Paper Structure (17 sections, 14 equations, 6 figures, 3 tables)

Introduction
Methodology
Problem Formulation
Overview of FedMAC
Client's Multi-modal Representation Generation
Missing-Aware Encoder
Cross-Modal Aggregator
Contrastive-based Regularization
Aggregation Operation on Server
Evaluation
Experimental Settings
Experimental Results
Similar missing statistics between the client and server
Different missing statistics among client and server
Ablation Study
...and 2 more sections

Figures (6)

Figure 1: Multi-modal settings in Federated Learning and their performance. The ideal full-modality setting performs best, whereas complete or partial modality missing significantly decreases performance.
Figure 2: Overview of FedMAC. We utilize a conventional weighted averaging aggregation on the server side and propose a novel model architecture and training algorithm on the client side.
Figure 3: Details of our proposed architecture for clients' local models.
Figure 4: Global fusion mechanism.
Figure 5: Example of an incomplete dataset $\hat{D}$. A missing pattern $(p_m, p_s) = (1.0, 0.5)$ indicates that 100% of the modalities are randomly missing in 50% of the samples.
...and 1 more figure
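
The missing pattern $(p_m, p_s)$ in the Figure 5 caption can be illustrated with a small simulation. The sketch below is not taken from the paper or its repository. It assumes one literal reading of the caption: a fraction $p_s$ of samples is chosen at random, and within each chosen sample a fraction $p_m$ of the modalities is dropped at random. The function name `make_missing_mask`, its parameters, and the sample and modality counts are hypothetical.

```python
import random


def make_missing_mask(n_samples, n_modalities, p_m, p_s, seed=0):
    """Return a list of rows; mask[i][j] is True if modality j of sample i is present.

    Assumed reading of (p_m, p_s): a fraction p_s of the samples is affected,
    and each affected sample loses a fraction p_m of its modalities.
    """
    rng = random.Random(seed)
    # Start from a complete dataset: every modality present in every sample.
    mask = [[True] * n_modalities for _ in range(n_samples)]
    n_affected = round(p_s * n_samples)    # number of samples with missing data
    n_dropped = round(p_m * n_modalities)  # modalities removed per affected sample
    for i in rng.sample(range(n_samples), n_affected):
        for j in rng.sample(range(n_modalities), n_dropped):
            mask[i][j] = False
    return mask


# Hypothetical sizes: 8 samples with 4 modalities each, pattern (1.0, 0.5).
mask = make_missing_mask(n_samples=8, n_modalities=4, p_m=1.0, p_s=0.5)
for row in mask:
    print("".join("x" if present else "." for present in row))
```

Under this reading, the pattern $(1.0, 0.5)$ leaves half of the rows fully present and half fully empty. Intermediate values of $p_m$ produce the instance-level heterogeneity described in the abstract, because each affected sample loses a different random subset of modalities.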
