DermDiff: Generative Diffusion Model for Mitigating Racial Biases in Dermatology Diagnosis
Nusrat Munia, Abdullah-Al-Zubaer Imran
TL;DR
The paper tackles racial bias in dermatology AI arising from underrepresented skin tones in public datasets. It introduces DermDiff, a latent diffusion-based framework conditioned on skin-tone and disease attributes via CLIP prompts, complemented by a skin-tone detector and a downstream ResNeXt-101 classifier trained with real and synthetic data. Empirical results show high fidelity and diversity of generated images (as measured by FID and MS-SSIM) and improved downstream performance and fairness metrics for darker skin tones when synthetic data are incorporated. This approach offers a scalable path to more equitable dermatology AI by augmenting imbalanced datasets with controlled synthetic imagery while preserving diagnostic utility.
Abstract
Skin diseases, such as skin cancer, are a significant public health issue, and early diagnosis is crucial for effective treatment. Artificial intelligence (AI) algorithms have the potential to assist in triaging benign vs. malignant skin lesions and to improve diagnostic accuracy. However, existing AI models for skin disease diagnosis are often developed and tested on limited and biased datasets, leading to poor performance on certain skin tones. To address this problem, we propose a novel generative model, named DermDiff, that can generate diverse and representative dermoscopic image data for skin disease diagnosis. Leveraging text prompting and multimodal image-text learning, DermDiff improves the representation of underrepresented groups (patients, diseases, etc.) in highly imbalanced datasets. Our extensive experiments show that DermDiff generates images with high fidelity and diversity. Furthermore, downstream evaluation suggests the potential of DermDiff in mitigating racial biases in dermatology diagnosis. Our code is available at https://github.com/Munia03/DermDiff
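
To make the idea of rebalancing with text-prompted synthetic data concrete, the following is a minimal sketch, not the authors' implementation. It assumes each real image carries a (skin tone, disease) label, counts how many synthetic images each group would need to match the largest group, and builds one text prompt per underrepresented group. The function names, the prompt template, and the toy counts are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def synthetic_quota(labels):
    """Return, for each (skin_tone, disease) group present in `labels`,
    how many synthetic images would bring it up to the largest group."""
    counts = Counter(labels)
    target = max(counts.values())
    return {group: target - n for group, n in counts.items()}


def build_prompt(skin_tone, disease):
    # Hypothetical prompt template; the paper's exact wording is not given here.
    return f"a dermoscopic image of a {disease} skin lesion on {skin_tone} skin"


# Toy, made-up metadata: one (skin tone, disease) pair per real image.
real = (
    [("light", "benign")] * 6
    + [("light", "malignant")] * 3
    + [("dark", "benign")] * 2
    + [("dark", "malignant")] * 1
)

quota = synthetic_quota(real)

# Only groups with a deficit need prompts for the generative model.
prompts = {group: build_prompt(*group) for group, n in quota.items() if n > 0}

for group, n in quota.items():
    print(group, "needs", n, "synthetic images")
```

In this sketch the prompts would be passed to a text-conditioned generator, and the resulting images mixed with the real data before training the downstream classifier. Matching the largest group exactly is only one possible target; a softer ratio may be preferable in practice.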
