Mitigating Overfitting in Medical Imaging: Self-Supervised Pretraining vs. ImageNet Transfer Learning for Dermatological Diagnosis

Iván Matas; Carmen Serrano; Miguel Nogales; David Moreno; Lara Ferrándiz; Teresa Ojeda; Begoña Acha

TL;DR

This work addresses the overfitting and domain mismatch that arise when ImageNet pretraining is applied to dermatology. It compares two pathways: self-supervised, domain-specific pretraining, in which a variational autoencoder (VAE) with a randomly initialized ConvNeXt-Tiny encoder is trained on dermatological images, and a conventional ImageNet-pretrained backbone. Both use an identical classifier and a dermatoscopic dataset augmented with ISIC data. The ImageNet pathway converges quickly but overfits to features that are not clinically relevant, whereas the self-supervised pathway improves steadily and generalizes better, suggesting that domain-specific pretraining can reach robust clinical performance with further tuning. The study highlights the importance of tailoring pretraining strategies to medical imaging tasks in order to improve diagnostic support and reliability in real-world settings.

Abstract

Deep learning has transformed computer vision but relies heavily on large labeled datasets and computational resources. Transfer learning, particularly fine-tuning pretrained models, offers a practical alternative; however, models pretrained on natural image datasets such as ImageNet may fail to capture domain-specific characteristics in medical imaging. This study introduces an unsupervised learning framework that extracts high-value dermatological features instead of relying solely on ImageNet-based pretraining. We employ a Variational Autoencoder (VAE) trained from scratch on a proprietary dermatological dataset, allowing the model to learn a structured and clinically relevant latent space. This self-supervised feature extractor is then compared to an ImageNet-pretrained backbone under identical classification conditions, highlighting the trade-offs between general-purpose and domain-specific pretraining. Our results reveal distinct learning patterns. The self-supervised model achieves a final validation loss of 0.110 (-33.33%), while the ImageNet-pretrained model stagnates at 0.100 (-16.67%), indicating overfitting. Accuracy trends confirm this: the self-supervised model improves from 45% to 65% (+44.44%) with a near-zero overfitting gap, whereas the ImageNet-pretrained model reaches 87% (+50.00%) but plateaus at 75% (+19.05%), with its overfitting gap increasing to +0.060. These findings suggest that while ImageNet pretraining accelerates convergence, it also amplifies overfitting on non-clinically relevant features. In contrast, self-supervised learning achieves steady improvements, stronger generalization, and superior adaptability, underscoring the importance of domain-specific feature extraction in medical imaging.
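The percentages quoted in the abstract can be read as relative changes with respect to a starting value. The following is a minimal sketch of that arithmetic, assuming each percentage is measured against the metric's initial value; the abstract does not state this explicitly. The helper names `relative_change` and `implied_start` are illustrative and do not come from the paper.

```python
def relative_change(start, end):
    """Percent change from start to end, relative to start."""
    return (end - start) / start * 100.0


def implied_start(end, pct_change):
    """Starting value implied by a final value and its percent change.

    Assumption: pct_change is relative to the starting value.
    """
    return end / (1.0 + pct_change / 100.0)


# Self-supervised accuracy: 45% -> 65% (both endpoints given in the abstract).
acc_gain = relative_change(45.0, 65.0)

# Only final losses and percent changes are given, so the starting losses
# below are back-calculated under the stated assumption, not reported values.
ssl_loss_start = implied_start(0.110, -33.33)
imagenet_loss_start = implied_start(0.100, -16.67)

print(f"self-supervised accuracy gain: {acc_gain:+.2f}%")
print(f"implied self-supervised starting loss: {ssl_loss_start:.3f}")
print(f"implied ImageNet starting loss: {imagenet_loss_start:.3f}")
```

Under this reading, the accuracy gain reproduces the quoted +44.44%. The back-calculated starting losses come out near 0.165 for the self-supervised model and 0.120 for the ImageNet-pretrained model, so the self-supervised model falls further from a higher starting point.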
