Enhancing Skin Disease Classification Leveraging Transformer-based Deep Learning Architectures and Explainable AI

Jayanth Mohan; Arrun Sivasubramanian; V Sowmya; Ravi Vinayakumar

TL;DR

The paper tackles automated skin disease classification across 31 classes, aiming to improve accuracy and interpretability in dermatology using deep learning. It systematically compares Vision Transformers (ViT), Swin Transformers, and the DinoV2 architecture, with transfer learning from ImageNet1k and data augmentation, against a CNN baseline. The DinoV2-B model achieves $Accuracy = 96.48\%$ and $F1 = 0.9727$, representing about a $10\%$ improvement over prior benchmarks, and its robustness is shown on HAM10000 and Dermnet. Explainability via GradCAM and SHAP localizes disease regions, supporting clinical adoption and potential mobile diagnostic tools.

Abstract

Skin diseases affect over a third of the global population, yet their impact is often underestimated. Automating skin disease classification to assist doctors with their prognosis can be difficult. Nevertheless, owing to their efficient feature extraction pipelines, deep learning techniques have shown much promise for various tasks, including dermatological disease identification. This study uses a skin disease dataset with 31 classes to compare all versions of Vision Transformers, Swin Transformers and DinoV2. The analysis is also extended to a comparison with the benchmark convolution-based architecture presented in the literature. Transfer learning with ImageNet1k weights on the skin disease dataset yields a high test accuracy of 96.48\% and an F1-Score of 0.9727 using DinoV2, which is almost a 10\% improvement over the current benchmark results on this dataset. DinoV2 was also evaluated on the HAM10000 and Dermnet datasets to test the model's robustness, and the trained model exceeds the benchmark results by a slight margin in both test accuracy and F1-Score on these 7-class and 23-class datasets, respectively. The results are substantiated using explainable AI frameworks such as GradCAM and SHAP, which localize the disease within the image, assisting dermatologists in early detection, prompt prognosis, and treatment.
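
The abstract reports two headline metrics, test accuracy and F1-Score. As a reading aid, the sketch below shows how both can be computed from true and predicted class labels. It is a minimal illustration, not the authors' evaluation code: the function name `accuracy_and_macro_f1` is hypothetical, and macro averaging (an unweighted mean of per-class F1) is an assumption, since the abstract does not say which averaging scheme was used.

```python
from collections import Counter


def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists.

    Hypothetical helper for illustration; macro averaging is an assumption.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but is wrong
            fn[t] += 1      # class t was missed

    accuracy = sum(tp.values()) / len(y_true)

    # Per-class F1 = 2*TP / (2*TP + FP + FN); 0.0 if the class never appears.
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    return accuracy, sum(f1_scores) / len(f1_scores)


# Toy example with three classes (not data from the paper).
acc, macro_f1 = accuracy_and_macro_f1([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
```

Because macro F1 weights every class equally, it can differ noticeably from accuracy on an imbalanced dataset, which is one reason papers on multi-class medical imaging often report both.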

Paper Structure (22 sections, 7 equations, 16 figures, 8 tables)

Introduction
Related Works
Methodology
Dataset Description
Transformer Networks used
Vision Transformers
Swin Transformers
DinoV2
XAI for explainability
Experimental Setup
Results and Discussion
Results on the combined SDC dataset
Explainability using XAI frameworks
Comparison of results on smaller benchmark datasets
Limitations and Future work
...and 7 more sections

Figures (16)

Figure 1: Sample images of each of the 31 classes (with abbreviations) of the SDC dataset ref26.
Figure 2: t-SNE plot of the train (left) and test (right) data.
Figure 3: Geometric augmentations used to upsample the dataset.
Figure 4: Train-Validation-Test data distribution for the unaugmented/raw and augmented datasets.
Figure 5: Overall methodology proposed in this work
...and 11 more figures
