Comparison of fine-tuning strategies for transfer learning in medical image classification

Ana Davila, Jacinto Colan, Yasuhisa Hasegawa

TL;DR

This study systematically compares eight fine-tuning strategies for transferring pre-trained CNNs to medical image classification across six datasets spanning X-ray, MRI, histology, dermoscopy, and endoscopy. Using three backbones (ResNet-50, DenseNet-121, and VGG-19), the authors show that no single method is universally optimal: linear probing is often weak, while linear probing followed by full fine-tuning (LP-FT) and Auto-RGN frequently yield robust gains, with DenseNet-121 benefiting most from non-standard fine-tuning approaches. Auto-RGN, which dynamically adjusts per-layer learning rates, improves performance by up to about 11% in some modalities. The findings offer practical guidance for choosing a transfer learning strategy in diverse medical imaging tasks and point to follow-up work with additional architectures and fine-tuning methods.

Abstract

In the context of medical imaging and machine learning, one of the most pressing challenges is the effective adaptation of pre-trained models to specialized medical contexts. Despite the availability of advanced pre-trained models, their direct application to the highly specialized and diverse field of medical imaging often falls short due to the unique characteristics of medical data. This study provides a comprehensive analysis of the performance of various fine-tuning methods applied to pre-trained models across a spectrum of medical imaging domains, including X-ray, MRI, histology, dermoscopy, and endoscopic surgery. We evaluated eight fine-tuning strategies, including standard techniques such as fine-tuning all layers or fine-tuning only the classifier layers, alongside methods such as gradually unfreezing layers, regularization-based fine-tuning, and adaptive learning rates. We selected three well-established CNN architectures (ResNet-50, DenseNet-121, and VGG-19) to cover a range of learning and feature extraction scenarios. Although our results indicate that the efficacy of these fine-tuning methods varies significantly depending on both the architecture and the medical imaging type, strategies such as combining Linear Probing with Full Fine-tuning resulted in notable improvements in over 50% of the evaluated cases, demonstrating general effectiveness across medical domains. Moreover, Auto-RGN, which dynamically adjusts learning rates, led to performance enhancements of up to 11% for specific modalities. Additionally, the DenseNet architecture showed more pronounced benefits from alternative fine-tuning approaches compared to traditional full fine-tuning. This work not only provides valuable insights for optimizing pre-trained models in medical image analysis but also suggests the potential for future research into more advanced architectures and fine-tuning methods.
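
To make "dynamically adjusts learning rates" concrete, the following is a minimal, framework-free sketch of one way per-layer learning rates can be derived from relative gradient norms. It assumes the formulation commonly associated with Auto-RGN (each layer's gradient norm divided by its parameter norm, then normalized by the largest such ratio); the summary above does not spell out the exact rule, so the function name `auto_rgn_learning_rates`, the `base_lr` and `eps` parameters, and the toy layer values are all illustrative assumptions rather than the authors' implementation.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def auto_rgn_learning_rates(params, grads, base_lr=1e-3, eps=1e-12):
    """Return a per-layer learning rate scaled by relative gradient norm.

    params, grads: dicts mapping a (hypothetical) layer name to a flat
    list of floats. Assumed rule: rgn = ||grad|| / ||param||, divided by
    the maximum rgn over layers so the scale factor lies in [0, 1].
    """
    rgn = {
        name: l2_norm(grads[name]) / (l2_norm(params[name]) + eps)
        for name in params
    }
    max_rgn = max(rgn.values())
    if max_rgn == 0.0:
        # No gradient signal anywhere: no layer is updated.
        return {name: 0.0 for name in rgn}
    return {name: base_lr * value / max_rgn for name, value in rgn.items()}


# Toy example: both layers receive the same gradient, but "fc" has
# smaller weights, so its relative gradient norm is larger.
params = {"conv1": [3.0, 4.0], "fc": [0.6, 0.8]}
grads = {"conv1": [0.3, 0.4], "fc": [0.3, 0.4]}
lrs = auto_rgn_learning_rates(params, grads, base_lr=0.01)
```

Under this assumed rule, the layer with the largest relative gradient norm trains at `base_lr` and every other layer trains at a proportionally smaller rate, which is how a method of this kind can concentrate updates on the layers that appear to need the most adaptation.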

Paper Structure

This paper contains 30 sections, 4 equations, 11 figures, 12 tables.

Figures (11)

  • Figure 1: Schematic representation of the transfer learning process, illustrating the knowledge transfer from a pre-trained source model to a target medical imaging dataset.
  • Figure 2: Illustration of the evaluated fine-tuning methods. Blue blocks represent fine-tunable layers, while gray blocks represent frozen (non-fine-tunable) layers. From top to bottom: (1) Fine-tuning strategies where all layers $\{l_1, \dots, l_N, FC\}$ are fine-tuned; (2) Linear Probing, where only the classifier layers $FC$ are retrained; (3) Gradual Unfreezing, where layers become fine-tunable progressively, starting from the last layer; (4) Gradual Unfreezing, where layers become fine-tunable progressively, starting from the first layer; (5) Training only the classifier layer initially, followed by fine-tuning all layers.
  • Figure 3: Representative images from the CheXpert, MURA, and T1w MRI datasets, showcasing a variety of radiographic and MRI imaging types.
  • Figure 4: Sample images from the BACH, ISIC 2020, and CholecT50 datasets, illustrating histology, dermoscopy, and endoscopic surgery imaging modalities.
  • Figure 5: Architectures of the pre-trained models.
  • ...and 6 more figures
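
The freezing patterns described in the Figure 2 caption can be sketched as a small schedule function that reports which layers are trainable at a given epoch. This is only an illustration under stated assumptions: the strategy names, the `epochs_per_stage` and `probe_epochs` parameters, unfreezing one layer per stage, and keeping the classifier `FC` trainable throughout gradual unfreezing are hypothetical choices that the caption does not specify.

```python
def trainable_layers(strategy, epoch, num_layers,
                     epochs_per_stage=1, probe_epochs=1):
    """Return the names of layers that are updated at a 0-indexed epoch.

    Layers are named l1..lN plus the classifier "FC", following the
    notation of the Figure 2 caption. Strategy names are illustrative.
    """
    backbone = [f"l{i}" for i in range(1, num_layers + 1)]
    head = ["FC"]

    if strategy == "full":
        # (1) Every layer is fine-tuned from the start.
        return backbone + head
    if strategy == "linear_probing":
        # (2) Backbone stays frozen; only the classifier is retrained.
        return head
    if strategy in ("unfreeze_last_to_first", "unfreeze_first_to_last"):
        # (3)/(4) Assumed schedule: one more backbone layer per stage.
        n_unfrozen = min(num_layers, 1 + epoch // epochs_per_stage)
        if strategy == "unfreeze_last_to_first":
            return backbone[num_layers - n_unfrozen:] + head
        return backbone[:n_unfrozen] + head
    if strategy == "lp_ft":
        # (5) Classifier only during the probing phase, then all layers.
        return head if epoch < probe_epochs else backbone + head
    raise ValueError(f"unknown strategy: {strategy}")


# Print the schedule for a toy 3-layer backbone over 4 epochs.
for name in ("full", "linear_probing", "unfreeze_last_to_first",
             "unfreeze_first_to_last", "lp_ft"):
    schedule = [trainable_layers(name, e, num_layers=3) for e in range(4)]
    print(name, schedule)
```

In a real training loop, the returned names would typically be used to toggle the gradient flag on the corresponding parameters before each epoch; that wiring depends on the framework and is omitted here.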