Medical Knowledge Intervention Prompt Tuning for Medical Image Classification

Ye Du; Nanxi Yu; Shujun Wang

TL;DR

Medical image classification with vision-language models faces high fine-tuning costs. This paper introduces CILMP, a framework that leverages disease-specific knowledge from large language models to generate instance-adaptive prompts for vision-language models through a conditional, low-rank intervention mechanism. Across 11 datasets and multiple modalities, CILMP consistently outperforms state-of-the-art prompt-tuning methods while using only a fraction of trainable parameters, approaching the performance of full fine-tuning. The approach demonstrates the practical value of transferring medical knowledge from LLMs into prompt tuning, enabling robust, efficient, and scalable adaptation of VLMs for clinical tasks.

Abstract

Vision-language foundation models (VLMs) have shown great potential in feature transfer and generalization across a wide spectrum of medical downstream tasks. However, fine-tuning these models is resource-intensive because of their large number of parameters. Prompt tuning has emerged as a viable way to reduce memory usage and training time while maintaining competitive performance. Nevertheless, existing prompt tuning methods cannot precisely distinguish between different kinds of medical concepts, and therefore miss disease-specific features across the imaging modalities encountered in medical image classification. We find that Large Language Models (LLMs), trained on extensive text corpora, are particularly adept at providing this specialized medical knowledge. Motivated by this, we propose incorporating LLMs into the prompt tuning process. Specifically, we introduce CILMP (Conditional Intervention of Large Language Models for Prompt Tuning), a method that bridges LLMs and VLMs to transfer medical knowledge into VLM prompts. CILMP extracts disease-specific representations from LLMs, intervenes on them within a low-rank linear subspace, and uses the result to create disease-specific prompts. In addition, a conditional mechanism conditions the intervention on each individual medical image, producing instance-adaptive prompts and thus improving adaptability. Extensive experiments across diverse medical image datasets show that CILMP consistently outperforms state-of-the-art prompt tuning methods. Code is available at https://github.com/usr922/cilmp.
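To make the pipeline described in the abstract more concrete, the following is a minimal NumPy sketch of an image-conditioned, low-rank intervention on per-class LLM representations. It is not the authors' implementation (see the repository linked above for that). The update rule, the conditioning term, and every name and dimension here (`R`, `W`, `b`, `W_cond`, `P`, `intervene`, `make_prompts`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; all of these are hypothetical.
n_classes, d_llm, d_img, d_vlm, rank = 3, 16, 8, 12, 4

# Frozen inputs: one LLM representation per disease class and one image feature.
llm_repr = rng.normal(size=(n_classes, d_llm))
img_feat = rng.normal(size=d_img)

# Trainable pieces (randomly initialised here; no training loop is shown).
# R has orthonormal rows, so it defines a rank-dimensional linear subspace.
R = np.linalg.qr(rng.normal(size=(d_llm, rank)))[0].T  # shape (rank, d_llm)
W = 0.1 * rng.normal(size=(rank, d_llm))
b = np.zeros(rank)
W_cond = 0.1 * rng.normal(size=(rank, d_img))  # assumed image-conditioning map
P = 0.1 * rng.normal(size=(d_vlm, d_llm))      # assumed projection to prompt space


def intervene(h, x):
    """Edit h only inside the subspace spanned by the rows of R.

    The target coordinates depend on both the LLM representation h and the
    image feature x, which is what makes the result instance-adaptive.
    """
    target = W @ h + b + W_cond @ x
    return h + R.T @ (target - R @ h)


def make_prompts(H, x):
    """Return one prompt vector per class for a single image feature x."""
    return np.stack([P @ intervene(h, x) for h in H])


prompts = make_prompts(llm_repr, img_feat)
print(prompts.shape)  # (3, 12)
```

In this sketch, the edit changes each representation only along the `rank` directions in `R`, so the number of trainable parameters scales with `rank` rather than with the full LLM or VLM. Passing a different image feature yields different prompts for the same class representations.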
