MFM-DA: Instance-Aware Adaptor and Hierarchical Alignment for Efficient Domain Adaptation in Medical Foundation Models

Jia-Xuan Jiang, Wenhui Lei, Yifeng Wu, Hongtao Wu, Furong Li, Yining Xie, Xiaofan Zhang, Zhong Wang

TL;DR

This work addresses domain gaps in medical foundation models under limited target-domain data by introducing MFM-DA, a two-stage, few-shot unsupervised domain adaptation framework. Stage 1 uses a DDPM trained on the source domain and adapted to the target via a Dynamic Instance-Aware Adaptor guided by a distribution-consistency loss $\mathcal{L}_{DC}$, producing target-style translations. Stage 2 fine-tunes the medical foundation model on generated target-like images using LoRA-based attention tuning and Pyramid Hierarchical Alignment, with a total loss $\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{BCE}} + \mathcal{L}_{\text{align}}$. The Dynamic Instance-Aware Adaptor combines a static domain direction $\Delta_{\text{static}}$ with a learnable per-batch direction $\Delta_{\text{dynamic}}$ to form $\Delta$, while pyramid-level feature alignment across $n$ levels and low-rank adapters align features across domains. The framework reports strong improvements on fundus segmentation and diffusion-based generation, offering practical domain-gap mitigation for MFMs in few-shot scenarios.
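As a rough illustration of how the Stage 2 objective $\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{BCE}} + \mathcal{L}_{\text{align}}$ could be assembled, the sketch below sums a binary cross-entropy term and an alignment term averaged over pyramid levels. Only the sum itself comes from the summary above. The concrete alignment term (squared gap between per-level feature means) and the function names `bce`, `align`, and `total_loss` are assumptions made for illustration, not the authors' definition.

```python
import math


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def align(src_pyramid, tgt_pyramid):
    """ASSUMED alignment term: squared gap between feature means, averaged over levels.

    Each pyramid is a list of levels; each level is a flat list of feature values.
    """
    gaps = []
    for src_level, tgt_level in zip(src_pyramid, tgt_pyramid):
        mu_src = sum(src_level) / len(src_level)
        mu_tgt = sum(tgt_level) / len(tgt_level)
        gaps.append((mu_src - mu_tgt) ** 2)
    return sum(gaps) / len(gaps)


def total_loss(probs, targets, src_pyramid, tgt_pyramid):
    """L_total = L_BCE + L_align, as stated in the summary (unweighted sum)."""
    return bce(probs, targets) + align(src_pyramid, tgt_pyramid)
```

When the two pyramids have identical per-level means, the alignment term vanishes and the total reduces to the segmentation loss alone.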

Abstract

Medical Foundation Models (MFMs), trained on large-scale datasets, have demonstrated superior performance across various tasks. However, these models still struggle with domain gaps in practical applications. Specifically, even after fine-tuning on source-domain data, task-adapted foundation models often perform poorly in the target domain. To address this challenge, we propose a few-shot unsupervised domain adaptation (UDA) framework for MFMs, named MFM-DA, which leverages only a limited number of unlabeled target-domain images. Our approach begins by training a Denoising Diffusion Probabilistic Model (DDPM), which is then adapted to the target domain using a proposed dynamic instance-aware adaptor and a distribution direction loss, enabling the DDPM to translate source-domain images into the target-domain style. The adapted images are subsequently processed through the MFM, where we introduce a channel-spatial alignment Low-Rank Adaptation (LoRA) to ensure effective feature alignment. Extensive experiments on optic cup and disc segmentation tasks demonstrate that MFM-DA outperforms state-of-the-art methods. Our work provides a practical solution to the domain gap issue in real-world MFM deployment. Code will be available here.

Paper Structure

This paper contains 6 sections, 12 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Performance comparison of RETFound on the cup-to-disc segmentation task for fundus images. Despite showing superior generalization compared to UNet trained from scratch, RETFound still suffers from domain shifts, with a noticeable decline in performance when applied to the target domain.
  • Figure 2: The proposed MFM-DA framework aims to perform domain adaptation of the foundation model by enabling the DDPM to adapt to target domain images in a few-shot setting. This process generates a large number of target domain images, which are then utilized for domain adaptation in the foundation model. The framework includes an Instance-aware Adaptor, which dynamically adjusts the adaptation direction of DDPM in the feature space, and a Hierarchical Alignment loss that aligns the pyramid features of the Foundation Model.
  • Figure 3: Exemplar results of our method compared to others in the (a) generation and (b) segmentation tasks. Methods marked with “*” are used in the MFM setting.
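
To make the Instance-aware Adaptor idea from Figure 2 concrete, the following minimal sketch forms an adaptation direction $\Delta$ from a static and a dynamic component. The source states only that the two are combined, so everything specific here is an assumption: the static direction is taken as the gap between mean target and mean source features, the combination is a weighted sum, and `mean_vector`, `static_direction`, `combined_direction`, and `weight` are hypothetical names. In the actual method the dynamic direction is learned, whereas here it is simply passed in as a vector.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length feature vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def static_direction(source_feats, target_feats):
    """ASSUMED static direction: mean target feature minus mean source feature."""
    mu_src = mean_vector(source_feats)
    mu_tgt = mean_vector(target_feats)
    return [t - s for s, t in zip(mu_src, mu_tgt)]


def combined_direction(delta_static, delta_dynamic, weight=1.0):
    """ASSUMED combination rule: Delta = Delta_static + weight * Delta_dynamic."""
    return [s + weight * d for s, d in zip(delta_static, delta_dynamic)]


# Toy usage: two source features, one target feature (a few-shot setting).
delta_static = static_direction([[0.0, 0.0], [2.0, 2.0]], [[3.0, 1.0]])
delta = combined_direction(delta_static, [0.5, -0.5])
```

Setting `weight` to zero recovers a purely static direction, which is one way to compare against a non-instance-aware baseline in this toy setup.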