S-SYNTH: Knowledge-Based, Synthetic Generation of Skin Images

Andrea Kim, Niloufar Saharkhiz, Elena Sizikova, Miguel Lago, Berkman Sahiner, Jana Delfino, Aldo Badano

TL;DR

Lack of diverse, well-annotated dermatology data hampers robust AI development. The authors introduce S-SYNTH, a knowledge-based skin synthesis framework that builds a detailed 3D multi-layer skin model with a growing lesion and renders physiologically plausible images under varied lighting. The work provides an open-source pipeline and demonstrates that synthetic data can augment limited real datasets, improving segmentation performance and revealing comparable trends across skin tones and lesion morphologies. This approach offers a scalable way to mitigate dataset biases and accelerate dermatology AI development, with broad implications for data augmentation and model evaluation in medical imaging.

Abstract

Development of artificial intelligence (AI) techniques in medical imaging requires access to large-scale and diverse datasets for training and evaluation. In dermatology, obtaining such datasets remains challenging due to significant variations in patient populations, illumination conditions, and acquisition system characteristics. In this work, we propose S-SYNTH, the first knowledge-based, adaptable open-source skin simulation framework to rapidly generate synthetic skin, 3D models and digitally rendered images, using an anatomically inspired multi-layer, multi-component skin and growing lesion model. The skin model allows for controlled variation in skin appearance, such as skin color, presence of hair, lesion shape, and blood fraction, among other parameters. We use this framework to study the effect of possible variations on the development and evaluation of AI models for skin lesion segmentation, and show that results obtained using synthetic data follow comparative trends similar to those of real dermatologic images, while mitigating biases and limitations of existing datasets, including small dataset size, lack of diversity, and underrepresentation.
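
The two ways of combining synthetic and real training images named in the Figure 4 caption (replacing a proportion of real images, or adding synthetic images to them) can be pictured with the minimal sketch below. It is an illustration only, not the authors' pipeline: the function names, the file-name lists, and the choice to express the added proportion relative to the number of real images are all assumptions.

```python
import random

def replace_with_synthetic(real, synthetic, fraction, rng):
    """Keep the training-set size fixed and swap a fraction of the
    real items for synthetic ones (hypothetical helper)."""
    n_swap = round(fraction * len(real))
    kept = rng.sample(real, len(real) - n_swap)
    # Assumes the synthetic pool holds at least n_swap items.
    return kept + rng.sample(synthetic, n_swap)

def add_synthetic(real, synthetic, fraction, rng):
    """Keep every real item and add synthetic items; the fraction is
    taken relative to the real count (an assumption of this sketch)."""
    n_add = round(fraction * len(real))
    return list(real) + rng.sample(synthetic, n_add)

# Placeholder identifiers standing in for image files.
real = [f"real_{i}" for i in range(100)]
synthetic = [f"synth_{i}" for i in range(200)]
rng = random.Random(0)

replaced = replace_with_synthetic(real, synthetic, 0.25, rng)
added = add_synthetic(real, synthetic, 0.5, rng)
```

In the replace setting the total number of training images stays constant, so any change in performance reflects the data source and not the dataset size; in the add setting the real images are all retained and the set grows.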

Paper Structure (14 sections, 10 figures, 1 table)

This paper contains 14 sections, 10 figures, 1 table.

Figures (10)

  • Figure 1: Overview of (a) a digital skin model generated in Houdini, and an example projection of 3 different skin lesion volumes at 5 growing time points, (b) digital rendering, with examples of generated synthetic images and their corresponding lesion masks, and (c) distribution of skin tone for real-patient images (ISIC) and synthetic images, as well as application of synthetic images for training and testing of an AI device for a skin lesion segmentation task.
  • Figure 2: Growing direction probabilities for each cell $c$ in the 3D plane. Note that the values are zero in the outward direction, to prevent the lesion from growing outside the skin.
  • Figure 3: Examples of S-SYNTH images generated with variations of (a) melanosome fraction, (b) blood fraction, (c) hair artifact, and (d) lesion shape.
  • Figure 4: Model performance changes when the training data is composed of (a) different numbers of real images, (b) different proportions of real images replaced with synthetic images, and (c) different proportions of synthetic images added to real images.
  • Figure 5: Model performance when trained on real-patient datasets (ISIC, HAM) and tested on synthetic (S-SYNTH) images generated with different parameters. BF: blood fraction, MF: melanosome fraction.
  • ...and 5 more figures
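
The idea in the Figure 2 caption, that each lesion cell grows in a direction drawn from a set of probabilities and that the outward probability is zero, can be sketched as a simple random growth process on a voxel grid. This is a simplified illustration under stated assumptions, not the S-SYNTH growth model: the grid size, the axis convention (z = 0 at the skin surface, z increasing with depth), the uniform weights, and the function name `grow_lesion` are all hypothetical.

```python
import random

# Six face-neighbor directions of a voxel as (dz, dy, dx) offsets.
# Assumed convention: z = 0 is the skin surface and z grows with depth,
# so dz = -1 is the outward direction.
DIRECTIONS = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]

def grow_lesion(shape, seed, weights, steps, rng):
    """Return the set of lesion voxels after `steps` growth attempts."""
    lesion = {seed}
    cells = [seed]
    for _ in range(steps):
        # Pick an existing lesion cell, then a growth direction for it.
        z, y, x = rng.choice(cells)
        dz, dy, dx = rng.choices(DIRECTIONS, weights=weights)[0]
        nxt = (z + dz, y + dy, x + dx)
        inside = all(0 <= v < s for v, s in zip(nxt, shape))
        # An attempt that leaves the grid or hits an occupied voxel is skipped.
        if inside and nxt not in lesion:
            lesion.add(nxt)
            cells.append(nxt)
    return lesion

# Illustrative weights: zero for the outward direction, uniform otherwise.
weights = [0.0, 0.2, 0.2, 0.2, 0.2, 0.2]
seed = (2, 8, 8)
lesion = grow_lesion((16, 16, 16), seed, weights, steps=500, rng=random.Random(0))
shallowest = min(z for z, _, _ in lesion)
```

Because the outward weight is zero, no voxel is ever added closer to the surface than an existing one, so `shallowest` stays at the depth of the seed. Saving the lesion set at intermediate step counts would give snapshots at successive growth time points, in the spirit of Figure 1(a).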