MultiFair: Multimodal Balanced Fairness-Aware Medical Classification with Dual-Level Gradient Modulation
Md Zubair, Hao Zheng, Nussdorf Jonathan, Grayson W. Armstrong, Lucy Q. Shen, Gabriela Wilson, Yu Tian, Xingquan Zhu, Min Shi
TL;DR
MultiFair tackles modality learning bias and demographic fairness bias in multimodal medical classification by introducing dual-level gradient modulation that adjusts training signals at both modality and group levels. It combines a multi-head attention fusion of modality encoders with modality balancing and gradient-direction alignment, complemented by a fairness-aware modulation using differentiable surrogate AUC across demographic groups and an overall fairness loss in the total objective. The approach is theoretically analyzed under smoothness assumptions and empirically validated on two glaucoma-focused multimodal datasets (FairVision and FairCLIP), where it achieves higher AUC and ES-AUC than unimodal, fairness-aware, and balanced baselines while reducing subgroup disparities. Overall, MultiFair provides a unified, generalizable framework for balanced, fairness-aware multimodal learning in high-stakes medical settings and beyond, with potential extensions to incomplete modality scenarios and multi-class problems.
Abstract
Medical decision systems increasingly rely on data from multiple sources to ensure reliable and unbiased diagnosis. However, existing multimodal learning models fail to achieve this goal because they often ignore two critical challenges. First, different data modalities may be learned unevenly, so the model converges to a solution biased towards certain modalities. Second, the model may emphasize learning on certain demographic groups, causing unfair performance across groups. The two aspects can influence each other, as different data modalities may favor different groups during optimization, leading to both imbalanced and unfair multimodal learning. This paper proposes a novel approach called MultiFair for multimodal medical classification, which addresses these challenges with a dual-level gradient modulation process. MultiFair dynamically modulates the direction and magnitude of training gradients at both the data modality and demographic group levels. We conduct extensive experiments on two multimodal medical datasets with different demographic groups. The results show that MultiFair outperforms state-of-the-art multimodal learning and fairness learning methods.
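
To make the idea of a surrogate AUC computed per demographic group (mentioned in the TL;DR) more concrete, the following is a minimal sketch. It is not the authors' formulation. It assumes a common pairwise sigmoid relaxation of AUC, and the function names `surrogate_auc` and `group_auc_gap`, the `temperature` parameter, and the choice of summing absolute deviations from the overall AUC are all hypothetical choices made for illustration.

```python
import math


def surrogate_auc(scores, labels, temperature=1.0):
    """Sigmoid-smoothed AUC (an assumed relaxation, not the paper's exact loss).

    Averages sigmoid((s_pos - s_neg) / temperature) over all
    positive/negative pairs. Returns None if either class is missing.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    total = sum(
        1.0 / (1.0 + math.exp(-(p - n) / temperature))
        for p in pos
        for n in neg
    )
    return total / (len(pos) * len(neg))


def group_auc_gap(scores, labels, groups, temperature=1.0):
    """Overall surrogate AUC, per-group surrogate AUCs, and their total gap.

    The gap (sum of absolute deviations from the overall value) is a
    hypothetical disparity measure chosen for this sketch.
    """
    overall = surrogate_auc(scores, labels, temperature)
    if overall is None:
        raise ValueError("need at least one positive and one negative sample")
    per_group = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        auc = surrogate_auc(
            [scores[i] for i in idx], [labels[i] for i in idx], temperature
        )
        if auc is not None:  # skip groups that lack one of the classes
            per_group[g] = auc
    gap = sum(abs(overall - a) for a in per_group.values())
    return overall, per_group, gap


# Toy usage: four samples, two demographic groups "a" and "b".
overall, per_group, gap = group_auc_gap(
    scores=[2.0, 1.0, -1.0, -2.0],
    labels=[1, 1, 0, 0],
    groups=["a", "b", "a", "b"],
)
```

Because every term is a smooth function of the scores, the same expression could be rewritten in an automatic-differentiation framework so that a gap of this kind contributes gradients during training. How MultiFair actually weights or modulates those gradients is not specified in this summary.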
