Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification

Ming Hu, Siyuan Yan, Peng Xia, Feilong Tang, Wenxue Li, Peibo Duan, Lin Zhang, Zongyuan Ge

TL;DR

This work tackles the problem of distribution shift in skin lesion classification by shifting the adaptation focus from model updates to test-time input adaptation. It introduces Diffusion-Driven Adaptation (DDA), which trains a diffusion model on source data and uses it to project target inputs back toward the source distribution during testing, paired with a structural guidance mechanism to preserve class information. A self-ensembling scheme blends predictions from original and adapted inputs to automatically balance reliance on adaptation, improving robustness across corruptions, architectures, and data regimes. Evaluations on the ISIC2019-C and Dermnet-C corruption benchmarks demonstrate that DDA is more robust than model-based and other diffusion-based baselines, with practical advantages in portability across targets and stability under varying data regimes. The approach offers a scalable, source-free, test-time solution that can augment real-world deployment of skin-disease classifiers, albeit with computational cost and potential biases to monitor.

Abstract

Deep learning-based diagnostic systems have demonstrated potential in skin disease diagnosis. However, their performance can easily degrade on test domains due to distribution shifts caused by input-level corruptions, such as imaging equipment variability, brightness changes, and image blur. This reduces the reliability of model deployment in real-world scenarios. Most existing solutions focus on adapting the source model through retraining on different target domains. Although effective, this retraining process is sensitive to the amount of data and the hyperparameter configuration for optimization. In this paper, we propose a test-time image adaptation method to enhance the accuracy of the model on test data by simultaneously updating and predicting test images. We modify the target test images by projecting them back to the source domain using a diffusion model. Specifically, we design a structure guidance module that adds refinement operations through low-pass filtering during reverse sampling, regularizing the diffusion to preserve structural information. Additionally, we introduce a self-ensembling scheme that automatically adjusts the reliance on adapted and unadapted inputs, enhancing adaptation robustness by rejecting inappropriate generative modeling results. To facilitate this study, we constructed the ISIC2019-C and Dermnet-C corruption robustness evaluation benchmarks. Extensive experiments on the proposed benchmarks demonstrate that our method makes the classifier more robust across various corruptions, architectures, and data regimes. Our datasets and code will be available at https://github.com/minghu0830/Skin-TTA_Diffusion.
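The self-ensembling idea in the abstract can be illustrated with a minimal sketch. It assumes the scheme averages the classifier's softmax outputs for the original and the adapted image, so that the more confident prediction dominates; the exact rule used by the authors is not stated here, and the names `softmax` and `self_ensemble` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_ensemble(logits_original, logits_adapted):
    """Blend predictions for the unadapted and the diffusion-adapted input.

    Assumption: equal-weight averaging of the two probability vectors.
    A confident prediction has a peaked distribution, so it outweighs
    a near-uniform (uncertain) one when the two are averaged.
    """
    p_original = softmax(logits_original)
    p_adapted = softmax(logits_adapted)
    blended = [0.5 * (a + b) for a, b in zip(p_original, p_adapted)]
    label = max(range(len(blended)), key=blended.__getitem__)
    return label, blended
```

With this rule, an adapted image that the classifier scores confidently overrides an uncertain prediction on the corrupted original, and a poor generative result that yields a near-uniform prediction is effectively ignored.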

Paper Structure (49 sections, 8 equations, 16 figures, 10 tables, 1 algorithm)

Figures (16)

  • Figure 1: One diffusion model can adapt inputs from new and multiple targets during testing. Our adaptation method, DDA, projects inputs from all target domains to the source domain by a generative diffusion model. Having trained on the source data alone, our source diffusion model for generation and source classification model for recognition do not need any updating, and therefore scale to multiple target domains without potentially expensive and sensitive re-training optimization.
  • Figure 2: DDA projects target inputs back to the source domain. Adapting the input during testing enables direct use of the source classifier without model adaptation. The projection adds noise (forward diffusion, green arrow) then iteratively updates the input (reverse diffusion, red arrow) with conditioning on the original input (guidance, purple arrow). For reliability, we ensemble predictions with and without adaptation depending on their confidence.
  • Figure 3: DDA reliably improves robustness across corruption types. We compare DDA with the source-only model, state-of-the-art diffusion for adversarial defense (DiffPure), and a simple ablation of DDA (DDA w/o Self-Ensembling (SE)). DDA is the best on average, strictly improves on DiffPure, and improves on simple diffusion in most cases. Our self-ensembling prevents catastrophic drops (on fog or contrast, for example).
  • Figure 4: DDA is invariant to batch size and data order while Tent is extremely sensitive. To analyze sensitivity to the amount and order of the data, we measure the average robustness of independent adaptation across corruption types. DDA does not depend on these factors and consistently improves on MEMO. Tent fails on class-ordered data without shuffling and degrades at small batch sizes.
  • Figure 5: Visualization of updates for ablations of diffusion.
  • ...and 11 more figures
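
The guidance step described in the Figure 2 caption (conditioning the reverse diffusion on the original input) can be sketched on a toy 1-D signal. This is a simplified illustration, not the paper's implementation: it assumes the low-pass filter is block averaging and that guidance replaces the low-frequency content of the current sample with that of the reference. The names `low_pass`, `guide_step`, `weight`, and `factor` are hypothetical.

```python
def low_pass(signal, factor=2):
    """Crude low-pass filter: average each block of `factor` samples,
    then repeat the block mean so the output keeps the input length."""
    if len(signal) % factor != 0:
        raise ValueError("signal length must be a multiple of factor")
    filtered = []
    for start in range(0, len(signal), factor):
        block = signal[start:start + factor]
        mean = sum(block) / factor
        filtered.extend([mean] * factor)
    return filtered


def guide_step(sample, reference, weight=1.0, factor=2):
    """Nudge a reverse-diffusion sample toward the reference's structure.

    Only the low-frequency difference is added, so the fine detail
    produced by the diffusion model is left untouched.
    """
    low_sample = low_pass(sample, factor)
    low_reference = low_pass(reference, factor)
    return [s + weight * (r - l)
            for s, r, l in zip(sample, low_reference, low_sample)]
```

With `weight=1.0`, the guided sample has the same low-pass content as the reference while its high-frequency residual is unchanged; smaller weights give a softer pull toward the input's structure.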