Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting

Kuniko Paxton, Zeinab Dehghani, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos

TL;DR

This work tackles fairness in skin lesion classification by reframing skin tone as a continuous distribution rather than a discrete category. It introduces a KDE-based approach on ITA-derived skin-tone representations and evaluates twelve distance metrics to quantify disparities, culminating in a Distance-based Reweighting (DRW) loss to address underrepresentation. Across CNN and transformer architectures, DRW consistently reduces the link between tone-density and accuracy, with Fidelity Similarity and related metrics proving particularly effective. The framework advances individual-level fairness in dermatology and offers generalizable methods for continuous sensitive attributes in medical imaging.

Abstract

Skin color has historically been a focal point of discrimination, yet fairness research in machine learning for medical imaging often relies on coarse subgroup categories, overlooking individual-level variations. Such group-based approaches risk obscuring biases faced by outliers within subgroups. This study introduces a distribution-based framework for evaluating and mitigating individual fairness in skin lesion classification. We treat skin tone as a continuous attribute rather than a categorical label, and employ kernel density estimation (KDE) to model its distribution. We further compare twelve statistical distance metrics to quantify disparities between skin tone distributions and propose a distance-based reweighting (DRW) loss function to correct underrepresentation in minority tones. Experiments across CNN and Transformer models demonstrate: (i) the limitations of categorical reweighting in capturing individual-level disparities, and (ii) the superior performance of distribution-based reweighting, particularly with Fidelity Similarity (FS), Wasserstein Distance (WD), Hellinger Metric (HM), and Harmonic Mean Similarity (HS). These findings establish a robust methodology for advancing fairness at the individual level in dermatological AI systems, and highlight broader implications for sensitive continuous attributes in medical image analysis.
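To make the idea of a continuous skin tone distribution concrete, the following is a minimal NumPy sketch, not the authors' implementation. It computes the Individual Typology Angle (ITA) from CIELAB values using the standard definition ITA = arctan((L* - 50) / b*) in degrees, estimates a Gaussian KDE on a fixed grid, and compares two densities with Fidelity Similarity (the Bhattacharyya coefficient) and the Hellinger distance. The function names, the grid range, and the fixed bandwidth of 5 degrees are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def ita_degrees(L, b):
    """ITA in degrees from CIELAB L* and b* (arctan2 matches arctan for b* > 0)."""
    L = np.asarray(L, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.degrees(np.arctan2(L - 50.0, b))


def gaussian_kde_on_grid(samples, grid, bandwidth=5.0):
    """Gaussian KDE evaluated on an evenly spaced grid.

    The bandwidth is a hypothetical fixed value, not taken from the paper.
    The result is renormalized so that it integrates to 1 on the grid.
    """
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    dx = grid[1] - grid[0]
    return density / (density.sum() * dx)


def fidelity_similarity(p, q, dx):
    """Bhattacharyya coefficient: integral of sqrt(p * q), in [0, 1]."""
    return float(np.sum(np.sqrt(p * q)) * dx)


def hellinger_distance(p, q, dx):
    """Hellinger distance derived from fidelity: sqrt(1 - FS)."""
    return float(np.sqrt(max(0.0, 1.0 - fidelity_similarity(p, q, dx))))


# Illustrative use with synthetic per-pixel ITA values for two images.
rng = np.random.default_rng(0)
grid = np.linspace(-90.0, 90.0, 361)
dx = grid[1] - grid[0]
light = gaussian_kde_on_grid(rng.normal(60.0, 4.0, size=500), grid)
dark = gaussian_kde_on_grid(rng.normal(-20.0, 4.0, size=500), grid)
print(fidelity_similarity(light, dark, dx), hellinger_distance(light, dark, dx))
```

Working on a shared grid keeps every per-image density comparable, so any grid-based distance can be swapped in without changing the rest of the pipeline.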

Paper Structure

This paper contains 23 sections, 9 equations, 9 figures, 3 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Fair Training Framework Procedure with Distance Metrics for Experiments: First, only skin regions are extracted from the training data, and skin color is quantified using ITA to obtain its distribution. Subsequently, the distance to the reference skin tone distribution, which is the aggregated median, is calculated, and based on the distance, inverse weighting is applied to the loss function while keeping the values continuous. This is a very simple method that extends the distribution-based reweighting technique traditionally used for categorical data to continuous data.
  • Figure 2: Performance and the Number of Data Differences in Skin Color Types 1 (Baseline)
  • Figure 3: Performance and the Number of Data Differences in Skin Color Types 1 (Categorical Reweighting Method)
  • Figure 4: Correlation between Performance and the Number of Training Data (ResNet)
  • Figure 5: Absolute Differences between Overall Max and Skin Color Types
  • ...and 4 more figures
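
The training procedure described in the Figure 1 caption can be sketched roughly as follows. This is a hedged illustration rather than the paper's DRW loss. It assumes that the "aggregated median" reference is the pointwise median of per-image densities, and that the weight grows with an image's distance from that reference; the exact reference construction and weighting formula are not given in this summary. The names `reference_density`, `distance_weights`, and `weighted_cross_entropy` are hypothetical.

```python
import numpy as np


def reference_density(densities, dx):
    """Pointwise median over per-image densities, renormalized to integrate to 1.

    Assumption: this is one reading of the "aggregated median" reference.
    """
    ref = np.median(np.asarray(densities, dtype=float), axis=0)
    total = ref.sum() * dx
    if total <= 0.0:
        raise ValueError("median density is zero everywhere on the grid")
    return ref / total


def distance_weights(distances, eps=1e-8):
    """Continuous per-sample weights that grow with distance from the reference.

    Assumption: a simple monotone mapping, rescaled to mean 1 so the overall
    loss scale stays comparable to the unweighted loss.
    """
    d = np.asarray(distances, dtype=float)
    w = 1.0 + d / (d.mean() + eps)
    return w / w.mean()


def weighted_cross_entropy(logits, labels, weights):
    """Per-sample weighted softmax cross-entropy (weighted mean)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    weights = np.asarray(weights, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    return float(np.sum(weights * nll) / np.sum(weights))


# Illustrative use: three samples, the third is farthest from the reference.
weights = distance_weights([0.1, 0.2, 0.7])
logits = np.array([[2.0, 0.0], [0.0, 2.0], [0.5, 0.1]])
labels = np.array([0, 1, 1])
print(weights, weighted_cross_entropy(logits, labels, weights))
```

Keeping the weights continuous, instead of binning tones into categories first, is what lets outliers within a nominal subgroup receive a different weight from the subgroup's typical members.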