MedPEFT-CL: Dual-Phase Parameter-Efficient Continual Learning with Medical Semantic Adapter and Bidirectional Memory Consolidation

Ziyuan Gao

TL;DR

This work tackles catastrophic forgetting when adapting medical vision-language segmentation models to new anatomical tasks. It introduces MedPEFT-CL, a dual-phase framework built on the CLIPSeg backbone that combines semantic similarity-driven adapter allocation with bidirectional Fisher-memory consolidation to enable continual, parameter-efficient learning. The adaptive phase uses medically informed prompts and cross-modal LoRA to adapt to new tasks with few trainable parameters ($0.24$-$0.39$M per task), while the consolidation phase preserves prior knowledge via a memory buffer and bidirectional Fisher-based replay. Across diverse medical datasets and task sequences, the authors report that MedPEFT-CL consistently reduces forgetting and preserves performance, achieving state-of-the-art results with significantly lower parameter overhead and competitive computational cost, which makes it well suited for clinical deployment.
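To make the parameter-efficiency claim concrete, here is a minimal sketch of a generic LoRA layer in NumPy. It is not the authors' implementation: the class `LoRALinear`, the helper `lora_param_count`, the rank, the layer width, and the `alpha` scaling are all illustrative assumptions, since this summary does not state which layers or ranks MedPEFT-CL adapts.

```python
import numpy as np

def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one LoRA pair: A is (rank, d_in), B is (d_out, rank)."""
    return rank * (d_in + d_out)

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update: W + (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                              # frozen backbone weight
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in)) # trainable, small random init
        self.B = np.zeros((d_out, rank))                  # trainable, zero init
        self.scale = alpha / rank

    def trainable_params(self):
        return self.A.size + self.B.size

    def __call__(self, x):
        # Base projection plus the low-rank correction.
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

# Hypothetical sizes: a 512 x 512 projection adapted at rank 4.
layer = LoRALinear(np.eye(512), rank=4)
print(layer.trainable_params(), "trainable vs", layer.weight.size, "frozen")
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen backbone exactly, and only the two small matrices need to be stored per task.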

Abstract

Medical vision-language segmentation models suffer from catastrophic forgetting when adapting to new anatomical structures, requiring complete retraining that limits their clinical deployment. Although continual learning has been studied for various applications, approaches designed specifically for medical vision-language tasks remain underexplored. We propose MedPEFT-CL, a parameter-efficient continual learning framework that addresses both efficient learning of new tasks and preservation of previous knowledge through a dual-phase architecture based on CLIPSeg. The architecture features an adaptive learning phase, which allocates adapters by analyzing prompt similarity and applies parameter-efficient fine-tuning for medical tasks, and a knowledge consolidation phase, which employs bidirectional Fisher-memory coordination. This creates a reinforcing cycle: consolidation directs replay priorities while new tasks provide challenging samples that improve retention strategies. Our key contributions are: (1) a semantic-driven adapter allocation mechanism that enables efficient learning of new medical tasks, (2) a bi-modal LoRA adaptation that significantly reduces trainable parameters while maintaining cross-modal learning, and (3) bidirectional Fisher-memory coordination that prevents catastrophic forgetting of previous medical tasks. Extensive experiments across diverse medical datasets demonstrate superior forgetting mitigation and performance retention with minimal parameter overhead, making the framework effective for continual learning in medical vision-language scenarios.
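The two phases described above can be illustrated with a small NumPy sketch. It rests on stated assumptions rather than the paper's actual formulation: adapter allocation is approximated by thresholded cosine similarity between prompt embeddings, and consolidation is approximated by a standard EWC-style diagonal-Fisher penalty. The function names, the threshold value, and the toy embeddings are hypothetical, and the bidirectional coordination between Fisher information and the memory buffer is not reproduced here.

```python
import numpy as np

def allocate_adapter(new_emb, adapter_embs, threshold=0.8):
    """Return the index of the most similar existing adapter, or None to create a new one.

    new_emb: (d,) prompt embedding of the incoming task.
    adapter_embs: (k, d) prompt embeddings of tasks that already own an adapter.
    """
    if len(adapter_embs) == 0:
        return None
    sims = adapter_embs @ new_emb / (
        np.linalg.norm(adapter_embs, axis=1) * np.linalg.norm(new_emb)
    )
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else None

def diagonal_fisher(per_sample_grads):
    """Diagonal Fisher estimate: mean of squared per-sample gradients, shape (n, p) -> (p,)."""
    return np.mean(np.square(per_sample_grads), axis=0)

def fisher_penalty(theta, theta_old, fisher, lam=1.0):
    """EWC-style quadratic penalty that discourages moving parameters with high Fisher values."""
    return 0.5 * lam * float(np.sum(fisher * (theta - theta_old) ** 2))

# Toy example with 2-D embeddings standing in for text-encoder outputs.
existing = np.array([[1.0, 0.0], [0.0, 1.0]])
print(allocate_adapter(np.array([0.9, 0.1]), existing))  # close to adapter 0, so it is reused
print(allocate_adapter(np.array([1.0, 1.0]), existing))  # similar to neither, so a new adapter is needed
```

In this simplified view, a similar prompt reuses an existing adapter so that related anatomy shares parameters, while the Fisher penalty would be added to the new-task loss to protect the weights that mattered most for earlier tasks.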

Paper Structure

This paper contains 26 sections, 12 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: We propose MedPEFT-CL, a PEFT framework that addresses both efficient learning of new medical tasks and preservation of previous knowledge through a dual-phase architecture for medical vision-language segmentation models.
  • Figure 2: Pipeline of MedPEFT-CL. Our method creates a reinforcement cycle between knowledge consolidation and new task learning. The consolidation phase prevents catastrophic forgetting through bidirectional Fisher-memory feedback, while the adaptive learning phase enables efficient adaptation to new medical segmentation tasks via semantic similarity-based adapter allocation. This cycle ensures that challenging samples enhance importance computation while preserving cross-modal knowledge.
  • Figure 3: Mixed Task Sequence: Kvasir$\rightarrow$CheX$\rightarrow$Colon$\rightarrow$BUSI$\rightarrow$Clinic$\rightarrow$ISIC$\rightarrow$ETIS$\rightarrow$Camus$\rightarrow$CVC300$\rightarrow$DFU. Visualization for mixed task sequences. Segmentation results from top row to bottom: Ground-truth, Ours, Moe-adapter, VR-LwF, DER.