Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity

Raman Dutt, Linus Ericsson, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy Hospedales

TL;DR

Medical image analysis faces data scarcity and high annotation costs, which motivates Parameter-Efficient Fine-Tuning (PEFT). This work presents a structured benchmark that evaluates 17 PEFT methods across CNNs and ViTs on six medical datasets, including a novel text-to-image generation task, based on more than 700 controlled experiments. The authors find that PEFT yields gains in low-data regimes, with improvements of up to 22% in discriminative and generative tasks, and that the gains scale with model size. LoRA excels for ViTs, SSF for CNNs, and simple bias- and norm-tuning strategies boost diffusion-based generation. The benchmark gives practical guidance for choosing a method by task and data regime, enables fair comparisons, and provides a platform to accelerate PEFT adoption in clinical workflows.

Abstract

Foundation models have significantly advanced medical image analysis through the pre-train, fine-tune paradigm. Among various fine-tuning algorithms, Parameter-Efficient Fine-Tuning (PEFT) is increasingly utilized for knowledge transfer across diverse tasks, including vision-language and text-to-image generation. However, its application in medical image analysis is relatively unexplored due to the lack of a structured benchmark for evaluating PEFT methods. This study fills this gap by evaluating 17 distinct PEFT algorithms across convolutional and transformer-based networks on image classification and text-to-image generation tasks using six medical datasets of varying size, modality, and complexity. Through a battery of over 700 controlled experiments, our findings demonstrate PEFT's effectiveness, particularly in the low-data regimes common in medical imaging, with performance gains of up to 22% in discriminative and generative tasks. These recommendations can assist the community in incorporating PEFT into their workflows and facilitate fair comparisons of future PEFT methods, ensuring alignment with advancements in other areas of machine learning and AI.

Paper Structure (29 sections, 4 equations, 3 figures, 6 tables)


Figures (3)

  • Figure 1: Performance of Full Fine-tuning, BitFit, and LoRA as the downstream dataset size varies, for ViT-Base and ViT-Large models.
  • Figure 2: Text-to-image generation examples shown alongside the ground truth, ordered by ascending average rank (best five), for two data regimes. The input prompt for the generated samples is: "No acute cardiopulmonary process."
  • Figure 3: Performance vs. Parameter Count for ResNet50 and ViT-Base Encoders. The marker size indicates the tunable parameter count for each method.
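
To give a rough sense of the tunable parameter counts that Figure 3 refers to, the sketch below compares full fine-tuning, LoRA, and BitFit on a single square dense layer. It is a minimal illustration and not the paper's setup: the layer width of 768 matches the ViT-Base hidden size, but the LoRA rank of 8 and the helper names (`full_params`, `lora_params`, `bitfit_params`) are assumptions chosen for this example.

```python
def full_params(d_out: int, d_in: int, bias: bool = True) -> int:
    """Parameters updated when a dense layer is fully fine-tuned."""
    return d_out * d_in + (d_out if bias else 0)


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters updated by a LoRA adapter on the same layer.

    LoRA freezes the weight W and learns a low-rank update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in).
    """
    return rank * (d_out + d_in)


def bitfit_params(d_out: int) -> int:
    """Parameters updated by BitFit: only the bias vector."""
    return d_out


d = 768    # hidden width of ViT-Base
rank = 8   # assumed LoRA rank, for illustration only

full = full_params(d, d)
lora = lora_params(d, d, rank)
bitfit = bitfit_params(d)

for name, n in [("full fine-tuning", full), ("LoRA", lora), ("BitFit", bitfit)]:
    print(f"{name:>16}: {n:>7} tunable parameters ({100 * n / full:.2f}% of full)")
```

Under these assumptions, both PEFT variants update only a small fraction of the layer's parameters, which is the quantity the marker size in Figure 3 encodes per method.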