Fair-MoE: Fairness-Oriented Mixture of Experts in Vision-Language Models

Peiran Wang; Linjie Tong; Jiaxiang Liu; Zuozhu Liu

TL;DR

Fair-MoE addresses fairness in medical Vision-Language Models (VLMs) with two components. FO-MoE employs embedding- and feature-based Mixture-of-Experts to filter biased patch embeddings. FOL is a fairness-oriented loss that combines distribution-distance and dispersion terms to enforce load-balanced fairness. The approach is validated on the Harvard-FairVLMed dataset, where it improves both accuracy (AUC) and fairness (ES-AUC, DPD, EOD) across multiple protected attributes, with parameter counts comparable to the baselines. Ablation studies confirm that both FO-MoE components and all FOL terms are needed, and highlight the benefit of dispersion-aware fairness. The work offers a practical framework for fair medical VLMs, and the authors state that the code will be released publicly.
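For readers unfamiliar with the fairness metrics named above, the following is a minimal sketch of how DPD, EOD, and ES-AUC are commonly computed for a binary classifier. The function names are hypothetical and are not taken from the paper's code. The ES-AUC formula (overall AUC divided by one plus the summed absolute gaps to the per-group AUCs) is an assumption based on how the equity-scaled metric is usually defined for this benchmark; the paper's exact definitions may differ.

```python
from collections import defaultdict


def _rate(flags):
    """Fraction of True values; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0


def demographic_parity_difference(y_pred, groups):
    """DPD: largest gap in positive-prediction rate between groups."""
    by_group = defaultdict(list)
    for pred, g in zip(y_pred, groups):
        by_group[g].append(pred == 1)
    rates = [_rate(v) for v in by_group.values()]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, groups):
    """EOD: larger of the between-group gaps in TPR and FPR."""
    on_pos = defaultdict(list)  # predictions on true positives, per group
    on_neg = defaultdict(list)  # predictions on true negatives, per group
    for t, p, g in zip(y_true, y_pred, groups):
        (on_pos if t == 1 else on_neg)[g].append(p == 1)
    all_groups = set(groups)
    tpr = [_rate(on_pos[g]) for g in all_groups]
    fpr = [_rate(on_neg[g]) for g in all_groups]
    return max(max(tpr) - min(tpr), max(fpr) - min(fpr))


def equity_scaled_auc(overall_auc, group_aucs):
    """ES-AUC (assumed form): AUC / (1 + sum_g |AUC - AUC_g|)."""
    gap = sum(abs(overall_auc - a) for a in group_aucs)
    return overall_auc / (1.0 + gap)


# Toy data: two groups of four samples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

dpd = demographic_parity_difference(y_pred, groups)
eod = equalized_odds_difference(y_true, y_pred, groups)
es_auc = equity_scaled_auc(0.8, [0.7, 0.9])
```

Lower DPD and EOD indicate more similar treatment across groups, while ES-AUC rewards a model only when its overall AUC is high and its per-group AUCs stay close to it.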

Abstract

Fairness is a fundamental principle in medical ethics. Vision Language Models (VLMs) have shown significant potential in the medical field because they leverage both visual and linguistic contexts, which reduces the need for large datasets and enables complex tasks. However, the exploration of fairness within VLM applications remains limited. Applying VLMs without a comprehensive analysis of fairness could raise concerns about equal treatment opportunities and diminish public trust in medical deep learning models. To build trust in medical VLMs, we propose Fair-MoE, a model specifically designed to ensure both fairness and effectiveness. Fair-MoE comprises two key components: the Fairness-Oriented Mixture of Experts (FO-MoE) and the Fairness-Oriented Loss (FOL). FO-MoE is designed to leverage the expertise of various specialists to filter out biased patch embeddings and to use an ensemble approach to extract more equitable, task-relevant information. FOL is a novel fairness-oriented loss function that not only minimizes the distances between different attributes but also optimizes the differences in the dispersion of the attributes' distributions. Extensive experiments demonstrate the effectiveness and fairness of Fair-MoE. On the Harvard-FairVLMed dataset, Fair-MoE improves both fairness and accuracy across all four attributes. Code will be publicly available.
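To make the idea behind FOL more concrete, here is a toy sketch of a penalty with one distance term and one dispersion term. It is not the paper's loss: the abstract does not give the formula, so the choice of mean gaps for distance, standard-deviation gaps for dispersion, the weight `lam`, and the function name are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def fairness_penalty_sketch(scores, groups, lam=1.0):
    """Illustrative penalty, NOT the paper's FOL.

    distance:   sum over groups of |mean_g - mean_all|
    dispersion: sum over groups of |std_g  - std_all|
    `lam` (hypothetical) weights the dispersion term.
    """
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)

    mu_all, sd_all = fmean(scores), pstdev(scores)
    distance = sum(abs(fmean(v) - mu_all) for v in by_group.values())
    dispersion = sum(abs(pstdev(v) - sd_all) for v in by_group.values())
    return distance + lam * dispersion


groups = ["a", "a", "b", "b"]

# Both groups share the same mean and spread: no penalty.
balanced = fairness_penalty_sketch([0.2, 0.8, 0.2, 0.8], groups)

# Groups differ in mean, and each is tighter than the pooled scores.
skewed = fairness_penalty_sketch([0.0, 0.0, 1.0, 1.0], groups)
```

In this toy form, matching group means alone is not enough: two groups with the same average score but very different spreads still incur a penalty through the dispersion term, which is the intuition the abstract attributes to FOL.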
