Fine-grained Prompt Tuning: A Parameter and Memory Efficient Transfer Learning Method for High-resolution Medical Image Classification

Yijin Huang; Pujin Cheng; Roger Tam; Xiaoying Tang

TL;DR

The paper tackles the memory bottleneck of applying large pre-trained models to high-resolution medical image classification. It proposes Fine-grained Prompt Tuning (FPT), a parameter-efficient transfer learning method that freezes the large pre-trained model and trains a lightweight side network instead. Three components transfer knowledge to the side network: asymmetric input resolution, fine-grained prompts, and a cross-attention-based fusion module. Two further mechanisms, important token selection and preloading of intermediate features, reduce training memory while preserving performance. With $512\times512$ inputs, FPT uses about $1.8\%$ of the learnable parameters and $13\%$ of the memory of a full ViT-B encoder, while maintaining competitive AUC across four medical datasets. It also shows favorable trade-offs on the performance-parameter-efficiency (PPE) and performance-memory-efficiency (PME) metrics, making large pre-trained models more practical for high-resolution medical imaging tasks.
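
To make the memory argument concrete, the sketch below counts the patch tokens a ViT produces at two input resolutions and the number of entries in one self-attention score matrix. The patch size of 16 and the $128\times128$ low-resolution input are illustrative assumptions, not values stated in this summary; the function names are hypothetical.

```python
def num_patch_tokens(image_size: int, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches (tokens) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def attention_entries(num_tokens: int) -> int:
    """Entries in one self-attention score matrix (per head, per layer)."""
    return num_tokens * num_tokens


# 512 is the high resolution quoted above; 128 is an assumed low resolution.
high_res = num_patch_tokens(512)  # tokens seen by the frozen pre-trained model
low_res = num_patch_tokens(128)   # tokens seen by the side network (assumed size)
ratio = attention_entries(high_res) // attention_entries(low_res)

print(high_res, low_res, ratio)
```

Under these assumptions the high-resolution branch handles 16 times as many tokens and 256 times as many attention entries per head as the low-resolution branch. This suggests why it helps to keep the high-resolution branch frozen and train only the branch that sees the small input.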

Abstract

Parameter-efficient transfer learning (PETL) is proposed as a cost-effective way to transfer pre-trained models to downstream tasks, avoiding the high cost of updating entire large-scale pre-trained models (LPMs). In this work, we present Fine-grained Prompt Tuning (FPT), a novel PETL method for medical image classification. FPT significantly reduces memory consumption compared to other PETL methods, especially in high-resolution input contexts. To achieve this, we first freeze the weights of the LPM and construct a learnable lightweight side network. The frozen LPM takes high-resolution images as input to extract fine-grained features, while the side network is fed low-resolution images to reduce memory usage. To allow the side network to access pre-trained knowledge, we introduce fine-grained prompts that summarize information from the LPM through a fusion module. Important token selection and preloading techniques are employed to further reduce training cost and memory requirements. We evaluate FPT on four medical datasets with varying sizes, modalities, and complexities. Experimental results demonstrate that FPT achieves comparable performance to fine-tuning the entire LPM while using only 1.8% of the learnable parameters and 13% of the memory costs of a ViT-B encoder with a $512\times512$ input resolution.

Paper Structure (18 sections, 2 equations, 4 figures, 3 tables)

Introduction
Method
Side Tuning
Asymmetric Input
Fine-grained Prompts and Fusion Module
Important Token Selection
Fine-grained Features Preloading
Experiments
Datasets
Training and Evaluation Setup
Experiment setup
Evaluation metric
Comparisons with State-of-the-art
Impact of Components
Impact of Important Token Selection Ratio
...and 3 more sections

Figures (4)

Figure 1: High-resolution comes at the cost of heightened GPU memory consumption.
Figure 2: Our proposed FPT shows the best trade-off between performance and efficiency. The size of the dots represents memory usage.
Figure 3: The proposed FPT framework.
Figure 4: Components of FPT: (a) The gradient flow of FPT. (b) Important tokens selection mechanism. (c) Fine-grained features preloading.
