Towards Facilitated Fairness Assessment of AI-based Skin Lesion Classifiers Through GenAI-based Image Synthesis

Ko Watanabe; Stanislav Frolov; Aya Hassan; David Dembinsky; Adriano Lucieri; Andreas Dengel

TL;DR

The paper tackles fairness auditing for AI-based skin lesion classifiers by developing a diffusion-based, attribute-controlled synthetic data generator (LightningDiT) to create demographically balanced dermoscopic cohorts. It demonstrates that synthetic cohorts reproduce bias patterns observed with real data across sex, age, and skin type while enabling controlled, privacy-preserving fairness evaluations on three pretrained melanoma classifiers. The study highlights both the promise and limitations of synthetic data for fairness testing, including potential dataset-shift effects and the need for quality control and prospective validation. Overall, the approach offers a practical workflow for systematic fairness audits in medical imaging and suggests paths to extend to multi-class diagnoses and fairness-driven model training.

Abstract

Recent advances in deep learning and on-device inference could transform routine screening for skin cancers. Along with the anticipated benefits of this technology come potential dangers from unforeseen and inherent biases. A significant obstacle is building evaluation datasets that accurately reflect key demographics, including sex, age, and race, as well as other underrepresented groups. To address this, we train a state-of-the-art generative model to produce synthetic data in a controllable manner and use it to assess the fairness of publicly available skin cancer classifiers. To evaluate whether synthetic images can serve as a fairness testing dataset, we prepare a real-image dataset (MILK10K) as a benchmark and compare the True Positive Rates of three models (DeepGuide, MelaNet, and SkinLesionDensnet). The classification tendencies of each model on real and generated images show similar patterns across the different attribute subsets. We confirm that highly realistic synthetic images facilitate model fairness verification.
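
As a rough illustration of the comparison described in the abstract, the True Positive Rate can be computed separately for each demographic subgroup and then compared between a real and a synthetic test set. The sketch below is not the authors' code: the function names `tpr_by_group` and `tpr_gap`, the binary label convention (1 = melanoma), and the toy labels are all assumptions made for illustration.

```python
from collections import defaultdict


def tpr_by_group(y_true, y_pred, groups):
    """Return {group: TPR}, counting only samples whose true label is 1."""
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                true_pos[group] += 1
    # Groups without any positive sample are omitted (TPR is undefined there).
    return {g: true_pos[g] / positives[g] for g in positives}


def tpr_gap(tprs):
    """Largest difference in TPR between any two subgroups."""
    return max(tprs.values()) - min(tprs.values())


# Toy labels, invented for illustration only (1 = melanoma, 0 = other).
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
groups = ["female"] * 5 + ["male"] * 5

tprs = tpr_by_group(y_true, y_pred, groups)
print(tprs)           # per-group TPR
print(tpr_gap(tprs))  # spread between best and worst subgroup
```

Running the same function on predictions for a real cohort and for a synthetic cohort, then comparing the per-group values, gives one simple way to check whether both test sets reveal the same subgroup tendencies.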
