Derm-T2IM: Harnessing Synthetic Skin Lesion Data via Stable Diffusion Models for Enhanced Skin Disease Classification using ViT and CNN

Muhammad Ali Farooq, Wang Yao, Michael Schukat, Mark A Little, Peter Corcoran

TL;DR

This work presents Derm-T2IM, a few-shot, diffusion-driven framework for generating large-scale synthetic dermatoscopic skin lesion data to augment limited real datasets. By fine-tuning a pre-trained diffusion model with DreamBooth and LoRA on a small seed set, the authors produce diverse malignant and benign samples conditioned on text prompts, then validate the data by fine-tuning ViT and MobileNetV2 classifiers with a hybrid (synthetic+real) training regime. The approach achieves notable improvements in cross-dataset accuracy and demonstrates robust segmentation and detection on synthetic data, while open-sourcing the Derm-T2IM model and dataset to enable broader research. Overall, synthetic dermatoscopic data via stable diffusion shows promise for improving generalization, privacy-preserving sharing, and rapid production of diverse training resources for skin disease classification.
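As a rough illustration of why LoRA suits fine-tuning on a small seed set, the sketch below counts the trainable parameters of a low-rank update, $W + \frac{\alpha}{r} BA$, against full fine-tuning of a single weight matrix, and applies that update in a toy forward pass. The layer width (768), the rank (4), and all function names are assumptions chosen for illustration; they are not values or code reported by the authors.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """Compute y = W x + (alpha / rank) * B (A x); W itself stays frozen."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


if __name__ == "__main__":
    d = 768  # hypothetical layer width, not taken from the paper
    r = 4    # hypothetical LoRA rank, not taken from the paper
    print(full_finetune_params(d, d), lora_params(d, d, r))
```

Under these assumed sizes, the low-rank factors hold 6,144 trainable weights against 589,824 for the full matrix, roughly 1%, which is the general reason LoRA is paired with few-shot methods such as DreamBooth.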

Abstract

This study explores the use of synthetic dermatoscopic data generated through stable diffusion models as a strategy for enhancing the robustness of machine learning model training. Synthetic data generation plays a pivotal role in mitigating challenges associated with limited labeled datasets, thereby facilitating more effective model training. In this context, we aim to incorporate enhanced data transformation techniques by extending the recent success of few-shot learning and small-data representation in text-to-image latent diffusion models. The optimally tuned model is then used to render high-quality synthetic skin lesion data with diverse and realistic characteristics, supplementing and diversifying the existing training data. We investigate the impact of incorporating the newly generated synthetic data into the training pipeline of state-of-the-art machine learning models, assessing its effectiveness in enhancing model performance and generalization to unseen real-world data. Our experimental results demonstrate that the synthetic data generated through stable diffusion models helps improve the robustness and adaptability of end-to-end CNN and vision transformer models on two different real-world skin lesion datasets.
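The idea of incorporating synthetic data into the training pipeline can be sketched as a simple mixing step. The example below is a minimal sketch, not the authors' pipeline: it keeps every real sample and adds enough synthetic samples to reach a target share of the combined set. The function name, the `(path, label)` item format, and the `synthetic_fraction` parameter are hypothetical; this section does not state the mixing ratio the authors used.

```python
import random


def build_hybrid_split(real_items, synthetic_items, synthetic_fraction, seed=0):
    """Mix real and synthetic samples into one shuffled training list.

    real_items / synthetic_items: lists of (path, label) tuples.
    synthetic_fraction: hypothetical target share of synthetic samples
        in the combined set, with 0 <= synthetic_fraction < 1.
    Returns a list of (path, label, source) tuples.
    """
    if not 0 <= synthetic_fraction < 1:
        raise ValueError("synthetic_fraction must be in [0, 1)")

    # Solve n_syn / (n_real + n_syn) = synthetic_fraction for n_syn,
    # then cap at the number of synthetic samples actually available.
    n_real = len(real_items)
    n_syn = round(synthetic_fraction * n_real / (1 - synthetic_fraction))
    n_syn = min(n_syn, len(synthetic_items))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    chosen = rng.sample(synthetic_items, n_syn)

    mixed = [(p, y, "real") for p, y in real_items]
    mixed += [(p, y, "synthetic") for p, y in chosen]
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder file names and labels, for illustration only.
    real = [(f"real_{i}.jpg", i % 2) for i in range(6)]
    synthetic = [(f"syn_{i}.png", i % 2) for i in range(10)]
    for item in build_hybrid_split(real, synthetic, synthetic_fraction=0.5):
        print(item)
```

Tagging each item with its source keeps evaluation honest: the held-out test set can then be restricted to real images only, which matches the stated goal of generalizing to unseen real-world data.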

Paper Structure

This paper contains 14 sections, 3 equations, 12 figures, and 4 tables.

Figures (12)

  • Figure 1: Comprehensive block diagram of the proposed methodology.
  • Figure 2: Hair-removal preprocessing outputs for two skin mole cases, generated using the DullRazor software.
  • Figure 3: Loss graph of the tuned Derm-T2IM, with a final loss value of 0.1394 and an overall training time of 10.4 hours. The second graph shows the average video RAM (5.3 GB) required during the whole training process.
  • Figure 4: Learning samples with predicted random noise during the fine-tuning process of Derm-T2IM: the first sample was extracted at 12990 steps and the last at 311760 steps.
  • Figure 5: Rendered outputs showing newly generated benign and malignant skin mole data using Derm-T2IM. The first, third, and fifth rows show benign data generated via the Euler, Euler a, and PLMS sampling methods, whereas the second, fourth, and sixth rows show malignant lesion inference results generated via the same three samplers.
  • ...and 7 more figures