Multimodal Federated Learning With Missing Modalities through Feature Imputation Network
Pranav Poudel, Aavash Chhetri, Prashnna Gyawali, Georgios Leontidis, Binod Bhattarai
TL;DR
This work tackles missing modalities in multimodal federated learning for healthcare by introducing a lightweight Feature Imputation Network (FIN) that operates in the representation space. FIN learns cross-modal mappings between bottleneck features from encoders for image $(I)$ and text $(T)$, enabling inference with incomplete data without sharing raw samples. Evaluations on MIMIC-CXR, NIH Open-I, and CheXpert across homogeneous and heterogeneous client configurations show that FIN outperforms naive imputations and a public-data–based generative baseline while remaining competitive with CAR-MFL, all with substantially lower communication and computation costs. The approach offers a privacy-preserving, scalable alternative for integrating multimodal information in clinical settings and points to extensions to additional modalities and architectures.
Abstract
Multimodal federated learning holds immense potential for collaboratively training models from multiple sources without sharing raw data, addressing both data scarcity and privacy concerns, two key challenges in healthcare. A major obstacle to training multimodal federated models in healthcare is missing modalities, which arise from variations in clinical practice, cost and accessibility constraints, retrospective data collection, privacy concerns, and occasional technical or human errors. Previous methods typically rely on publicly available real datasets or synthetic data to compensate for missing modalities. However, obtaining real datasets for every disease is impractical, and training generative models to synthesize missing modalities is computationally expensive and prone to errors due to the high dimensionality of medical data. In this paper, we propose a novel, lightweight, low-dimensional feature translator to reconstruct bottleneck features of the missing modalities. In experiments on three datasets (MIMIC-CXR, NIH Open-I, and CheXpert), in both homogeneous and heterogeneous settings, our method consistently improves over competitive baselines. The code and implementation details are available at: https://github.com/bhattarailab/FedFeatGen
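
To make the idea of feature-space imputation concrete, the following is a minimal sketch, not the implementation in the repository above. It assumes a toy setting in which a 3-dimensional text bottleneck feature and a 2-dimensional image bottleneck feature are related by a fixed linear rule, and it uses a plain linear map trained by stochastic gradient descent as a stand-in for the translator network. All names (`make_translator`, `translate`, `train_step`, `toy_pair`) and the dimensions are hypothetical, and no federated aggregation is shown.

```python
import random

def make_translator(dim_in, dim_out, seed=0):
    """Small random weight matrix (dim_out x dim_in); a linear stand-in for a translator."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim_in)] for _ in range(dim_out)]

def translate(W, x):
    """Map a feature vector of one modality into the feature space of the other."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def train_step(W, x, y, lr=0.05):
    """One SGD step on the squared error ||W x - y||^2 for a single paired sample."""
    pred = translate(W, x)
    err = [p - t for p, t in zip(pred, y)]
    for i, e in enumerate(err):
        for j, xj in enumerate(x):
            W[i][j] -= lr * 2 * e * xj  # gradient of the squared error w.r.t. W[i][j]
    return sum(e * e for e in err)

def toy_pair(rng):
    """Synthetic paired features; the linear relation is an assumption for illustration."""
    t = [rng.uniform(-1, 1) for _ in range(3)]   # pretend text bottleneck feature
    i = [t[0] + t[1], t[1] - t[2]]               # pretend image bottleneck feature
    return t, i

# Train on paired samples (as a client holding both modalities could do locally).
rng = random.Random(1)
W = make_translator(dim_in=3, dim_out=2)
for _ in range(2000):
    t_feat, i_feat = toy_pair(rng)
    train_step(W, t_feat, i_feat)

# At a text-only sample, impute the missing image feature and fuse by concatenation.
text_only = [0.5, -0.2, 0.1]
imputed_image = translate(W, text_only)
fused = text_only + imputed_image
```

The sketch works entirely on low-dimensional features rather than raw images or reports, which is the property the abstract highlights: the translator is far cheaper to train than a generative model over high-dimensional medical data. Fusion by concatenation is likewise an assumption made here for brevity.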
