Diffusion models applied to skin and oral cancer classification

José J. M. Uliana, Renato A. Krohling

TL;DR

This work evaluates DiffMIC, a diffusion-model-based classifier with dual-granularity guidance, on skin cancer (PAD-UFES-20) and oral cancer (P-NDB-UFES) imaging. DiffMIC performs competitively with CNNs and Transformers, reaching a balanced accuracy (BACC) of $0.6457$ on PAD-UFES-20 with six classes, $0.8357$ on its binary version, and $0.9050$ on P-NDB-UFES, while showing limited cross-dataset generalization to the HIBA clinical set. The method combines global saliency priors and local ROI crops within a DDPM-inspired training framework, using a ResNet18 backbone for guidance and 150 training epochs. Training uses $T=1000$ diffusion time steps and the diffusion losses $L(\theta)$ and $L_{\text{simple}}(\theta)$. Overall, diffusion-based classification is viable for medical images of skin and oral lesions, though robustness across datasets remains a challenge. Future work includes diffusion-driven data augmentation to boost the performance of traditional CNN and Transformer models.
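
To make the role of the $T=1000$ time steps and the simple loss more concrete, here is a minimal sketch of the generic DDPM forward (noising) process and the noise-prediction objective. It is not DiffMIC's actual implementation: the linear beta schedule, the choice of a one-hot label vector as the noised quantity, and the helper names `q_sample` and `simple_loss` are all assumptions made for illustration, and the dual-granularity conditioning is omitted entirely.

```python
import math
import random

T = 1000  # number of diffusion time steps, as stated in the summary

# Assumption: linear beta schedule from 1e-4 to 2e-2, a common DDPM default.
# The summary does not state which schedule is used.
betas = [1e-4 + (2e-2 - 1e-4) * t / (T - 1) for t in range(T)]

# Cumulative products alpha_bar_t = prod_{s<=t} (1 - beta_s).
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)


def q_sample(y0, t, rng):
    """Draw y_t ~ N(sqrt(alpha_bar_t) * y0, (1 - alpha_bar_t) * I).

    Returns the noised vector and the Gaussian noise that produced it.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in y0]
    y_t = [math.sqrt(a) * y + math.sqrt(1.0 - a) * e for y, e in zip(y0, eps)]
    return y_t, eps


def simple_loss(eps, eps_pred):
    """Mean squared error between true and predicted noise.

    This is the DDPM "simple" objective up to a constant scale factor.
    """
    return sum((e - p) ** 2 for e, p in zip(eps, eps_pred)) / len(eps)


# Hypothetical one-hot label for a six-class problem.
rng = random.Random(0)
y0 = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
y_t, eps = q_sample(y0, T - 1, rng)
```

At the last step `alpha_bars[T - 1]` is close to zero, so `y_t` is almost pure Gaussian noise. A denoising network would be trained to predict `eps` from `y_t` and `t`, and in a guided classifier also from image-derived conditioning.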

Abstract

This study investigates the application of diffusion models in medical image classification (DiffMIC), focusing on skin and oral lesions. Using the PAD-UFES-20 dataset for skin cancer and the P-NDB-UFES dataset for oral cancer, the diffusion model demonstrated competitive performance compared to state-of-the-art deep learning models such as Convolutional Neural Networks (CNNs) and Transformers. Specifically, for the PAD-UFES-20 dataset, the model achieved a balanced accuracy of 0.6457 for six-class classification and 0.8357 for binary classification (cancer vs. non-cancer). For the P-NDB-UFES dataset, it attained a balanced accuracy of 0.9050. These results suggest that diffusion models are a viable option for classifying medical images of skin and oral lesions. In addition, we investigate the robustness of the model by training it on PAD-UFES-20 and testing it on the clinical images of the HIBA dataset.
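
Since every result above is reported as balanced accuracy, a short sketch of how that metric is usually computed may help: it is the unweighted mean of per-class recall, so minority classes count as much as majority ones. The function name and the toy labels below are illustrative assumptions and are not taken from the paper's data or code.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Made-up, imbalanced toy labels: 8 nevus samples and 2 melanoma samples.
y_true = ["NEV"] * 8 + ["MEL"] * 2
y_pred = ["NEV"] * 8 + ["NEV", "MEL"]

plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bacc = balanced_accuracy(y_true, y_pred)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because one of the two melanoma samples is missed. This is why balanced accuracy is the more informative number for imbalanced lesion datasets.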

Paper Structure

This paper contains 16 sections, 5 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: DiffMIC framework with training (a) and inference (b) pipelines, plus a block showing the DCG model pipeline (c). This figure is taken from the original DiffMIC paper [yang2023diffmic].
  • Figure 2: A sample of each type of skin lesion present in the PAD-UFES-20 dataset [PadUfes]. (a) Basal Cell Carcinoma of skin. (b) Squamous Cell Carcinoma. (c) Actinic Keratosis. (d) Malignant Melanoma. (e) Melanocytic Nevus of Skin. (f) Seborrheic Keratosis.
  • Figure 3: Samples of histopathological images from each class in the P-NDB-UFES dataset [DELIMA2023].
  • Figure 4: Confusion matrix obtained by evaluating DiffMIC on the PAD-UFES-20 test set.
  • Figure 5: Confusion matrix obtained by evaluating DiffMIC, trained on the six classes of PAD-UFES-20, on the clinical images of the HIBA dataset.
  • ...and 2 more figures