Tell2Adapt: A Unified Framework for Source Free Unsupervised Domain Adaptation via Vision Foundation Model

Yulong Shi, Shijie Li, Ziyi Li, Lin Qi

TL;DR

Tell2Adapt is a novel SFUDA framework that harnesses the vast, generalizable knowledge of the Vision Foundation Model (VFM). It consistently outperforms existing approaches, achieving SOTA for a unified SFUDA framework in medical image segmentation.

Abstract

Source Free Unsupervised Domain Adaptation (SFUDA) is critical for deploying deep learning models across diverse clinical settings. However, existing methods are typically designed for specific, low-gap domain shifts and do not generalize to a unified, multi-modality, multi-target framework, which presents a major barrier to real-world application. To overcome this issue, we introduce Tell2Adapt, a novel SFUDA framework that harnesses the vast, generalizable knowledge of the Vision Foundation Model (VFM). Our approach ensures high-fidelity VFM prompts through Context-Aware Prompts Regularization (CAPR), which robustly translates varied text prompts into canonical instructions. This enables the generation of high-quality pseudo-labels for efficiently adapting the lightweight student model to the target domain. To guarantee clinical reliability, the framework incorporates Visual Plausibility Refinement (VPR), which leverages the VFM's anatomical knowledge to re-ground the adapted model's predictions in the target image's low-level visual features, effectively removing noise and false positives. We conduct one of the most extensive SFUDA evaluations to date, validating our framework across 10 domain adaptation directions and 22 anatomical targets, including brain, cardiac, polyp, and abdominal targets. Our results demonstrate that Tell2Adapt consistently outperforms existing approaches, achieving SOTA for a unified SFUDA framework in medical image segmentation. Code is available at https://github.com/derekshiii/Tell2Adapt.

Paper Structure (23 sections, 7 equations, 4 figures, 12 tables)

This paper contains 23 sections, 7 equations, 4 figures, 12 tables.

Figures (4)

  • Figure 1: Distribution shift across imaging modalities in BraTS. Kolmogorov-Smirnov statistics quantify the divergence between intensity distributions of the four different imaging modalities.
  • Figure 2: Generalization performance of Tell2Adapt across diverse adaptation tasks. The outermost ring displays the adaptation direction, the middle ring shows Dice score (DICE), while the innermost ring represents the Average Surface Distance (ASD).
  • Figure 3: Overview of Tell2Adapt, a unified SFUDA framework. The workflow decouples pseudo-label generation from the source model by leveraging a VFM guided by text prompts. We introduce CAPR to standardize these prompts, enabling the VFM to generate high-fidelity pseudo-labels. This knowledge is then distilled into the lightweight source model via self-training and refined using VPR.
  • Figure 4: Qualitative results of Tell2Adapt, demonstrating its strong generalization capability across diverse and challenging SFUDA directions. The figure presents segmentation outputs from Tell2Adapt adapted to various target domains. Specifically, the first and second rows show MR-CT adaptation for abdominal targets in AMOS, while the third row illustrates extreme-gap adaptation in cardiac targets (MR-US), single-target adaptation in polyp (Kvasir-CVCDB), and multi-sequence adaptation in brain targets (T1n-T2w and T1c-T2f).
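
The captions for Figures 1 and 2 refer to two standard quantities: the Kolmogorov-Smirnov statistic between intensity distributions and the Dice score. The following is a minimal Python sketch of their textbook definitions, not the authors' evaluation code. The function names `ks_statistic` and `dice_score`, the flat-list inputs, and the toy values are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples. The gap can only change at observed values, so it is
    enough to evaluate both CDFs at every sample point.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    sorted_a, sorted_b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for x in sorted_a + sorted_b:
        cdf_a = bisect_right(sorted_a, x) / len(sorted_a)
        cdf_b = bisect_right(sorted_b, x) / len(sorted_b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def dice_score(pred_mask, true_mask):
    """Dice overlap 2|P ∩ T| / (|P| + |T|) for flat binary masks.

    Two empty masks are treated as a perfect match (a common convention,
    assumed here).
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy data: voxel intensities from two imaging sequences,
# and a predicted versus reference segmentation mask.
intensities_seq_a = [0.10, 0.20, 0.25, 0.40]
intensities_seq_b = [0.35, 0.50, 0.60, 0.90]
print(ks_statistic(intensities_seq_a, intensities_seq_b))
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A KS statistic near 0 means the two intensity distributions are close, and a value near 1 means they barely overlap. A Dice score of 1 means the predicted and reference masks coincide exactly.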