MedForge: Building Medical Foundation Models Like Open Source Software Development

Zheling Tan, Kexin Ding, Jin Gao, Mu Zhou, Dimitris Metaxas, Shaoting Zhang, Dequan Wang

TL;DR

MedForge tackles the challenge of data silos and privacy in medical foundation model development by enabling asynchronous, community-driven model merging. The framework uses task-specific LoRA plugin modules and distilled datasets to integrate knowledge from multiple clinical centers without sharing raw data. Two merging strategies, MedForge-Fusion and MedForge-Mixture, provide flexible paths for incremental knowledge integration; Mixture is the more robust of the two because it aggregates plugin outputs rather than directly altering plugin parameters. Empirical results on BreakHis, LC25000, and MedFMC-Colon show that MedForge-Mixture achieves higher accuracy and AUC than single-task baselines and other collaborative approaches, demonstrating the approach’s potential for scalable, privacy-preserving, multi-task clinical AI development.
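
The difference between merging parameters and merging outputs can be sketched in a few lines of pure Python. This is an illustration only, not the authors' implementation: the function names `fuse_parameters` and `mix_outputs`, the convex-combination form of the fusion, and the hand-picked weights are all assumptions, since this summary does not specify how the merging weights are obtained.

```python
# Toy contrast between parameter-level and output-level merging.
# All names and the exact weighting formulas are hypothetical.

def fuse_parameters(main_params, branch_params, weight):
    """Fusion-style merge (assumed form): blend the branch plugin's
    parameters into the main plugin's parameters, producing new parameters."""
    return [(1 - weight) * m + weight * b
            for m, b in zip(main_params, branch_params)]


def mix_outputs(outputs, weights):
    """Mixture-style merge (assumed form): leave every plugin untouched and
    take a normalized weighted sum of the outputs the plugins produce."""
    total = sum(weights)
    width = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) / total
            for i in range(width)]


# Fusion changes the stored parameters of the main plugin.
fused = fuse_parameters([1.0, 2.0], [3.0, 4.0], weight=0.5)

# Mixture keeps both plugins as they are and only combines their outputs.
mixed = mix_outputs([[1.0, 0.0], [0.0, 1.0]], weights=[3.0, 1.0])
```

In this sketch a poorly matched branch plugin can only shift the mixed output in proportion to its weight, whereas in the fused case it overwrites part of the main plugin's parameters, which is one way to read the robustness claim above.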

Abstract

Foundation models (FMs) have made significant strides in the healthcare domain. Yet data silos and privacy concerns persist in healthcare systems, hindering safe medical data sharing and collaborative model development among institutions. The collection and curation of scalable clinical datasets has increasingly become a bottleneck for training strong FMs. In this study, we propose Medical Foundation Models Merging (MedForge), a cooperative framework that enables community-driven medical foundation model development while preventing the leakage of raw patient data and mitigating the synchronization issues of model development across clinical institutions. MedForge offers a bottom-up model construction mechanism by flexibly merging task-specific Low-Rank Adaptation (LoRA) modules, which can adapt to downstream tasks while retaining the original model parameters. Through an asynchronous LoRA module integration scheme, the resulting composite model can progressively improve its overall performance on various clinical tasks. MedForge shows strong performance on multiple clinical datasets (e.g., breast cancer, lung cancer, and colon cancer) collected from different institutions. Our major findings highlight the value of collaborative foundation models in advancing multi-center clinical collaboration effectively and cohesively. Our code is publicly available at https://github.com/TanZheling/MedForge.
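
As a reminder of why a LoRA module can adapt a model while retaining its original parameters, the sketch below builds an effective weight matrix as the frozen weight plus a low-rank product. It is a generic illustration of the LoRA idea in pure Python, not code from the MedForge repository; the helper names `matmul` and `effective_weight`, the `scale` parameter, and the toy matrices are assumptions.

```python
# Generic LoRA-style update: W_eff = W_frozen + scale * (B @ A).
# Matrices are lists of rows; all names and values are illustrative.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def effective_weight(w_frozen, lora_b, lora_a, scale=1.0):
    """Return a new matrix W + scale * B @ A without modifying w_frozen."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w_frozen, delta)]


w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]          # frozen base weight, shape 2x3
a = [[0.5, -0.5, 1.0]]         # trainable, rank r = 1, shape 1x3

# With B initialised to zero the plugin is a no-op on the base model.
b_zero = [[0.0], [0.0]]        # shape 2x1
w_start = effective_weight(w, b_zero, a)

# After (hypothetical) training, B is non-zero and shifts the weight.
b_trained = [[1.0], [2.0]]
w_adapted = effective_weight(w, b_trained, a)
```

Only `a` and `b_trained` would need to be shared as a plugin here; the base matrix `w` is never altered, which is what makes such modules easy to attach, detach, and merge.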

Paper Structure

This paper contains 34 sections, 4 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: Overview of MedForge. (a) Feature branch development. A branch contributor commits its branch plugin module and distilled data, then pushes them to MedForge, which merges the feature branch into the main branch. In our experiments, we adopt a LoRA module as the plugin module. (b) Merging stage. Branch contributors can asynchronously commit and push their branch plugin modules and distilled datasets to the main branch. The forge items of the main branch are then updated to equip the main branch model with new capabilities.
  • Figure 2: Main model architecture. We adopt CLIP as the base module and attach LoRA modules to the visual encoder and visual projection as the plugin module. Throughout all procedures, only the plugin modules are tuned while the rest of the model is frozen. We obtain the classification result by comparing the cosine similarity of the visual and text embeddings.
  • Figure 3: The detailed methodology of the proposed Fusion. Branch contributors can asynchronously commit and push their branch plugin modules and distilled datasets. The plugin modules are then fused, with weights, into the current main plugin module.
  • Figure 4: The detailed methodology of the proposed Mixture. Branch contributors can asynchronously commit and push their branch plugin modules and distilled datasets. The outputs of the different plugin modules are aggregated with weights, and the weights of each merging step are saved.
  • Figure 5: The distribution of the raw dataset and the distilled data.
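
The classification step described in the Figure 2 caption, comparing the cosine similarity of visual and text embeddings, can be illustrated with a small pure-Python sketch. It does not call CLIP or any real encoder: the three-dimensional embeddings, the class labels, and the function names `cosine_similarity` and `classify` are made up for illustration.

```python
import math

# Toy cosine-similarity classifier over precomputed embeddings.
# In a real pipeline the vectors would come from image and text encoders;
# here they are hand-written placeholders.

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(image_embedding, text_embeddings):
    """Return the label whose text embedding is most similar to the image
    embedding, together with all similarity scores."""
    scores = {label: cosine_similarity(image_embedding, emb)
              for label, emb in text_embeddings.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical class prompts, already embedded.
text_embeddings = {
    "benign": [1.0, 0.0, 0.0],
    "malignant": [0.0, 1.0, 0.0],
}
image_embedding = [0.2, 0.9, 0.1]   # hypothetical visual embedding

label, scores = classify(image_embedding, text_embeddings)
```

Because only the relative ordering of the similarity scores decides the prediction, tuning the plugin on the visual side is enough to change which class prompt an image lands closest to.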