Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting

Wei Li, Jingyang Zhang, Pheng-Ann Heng, Lixu Gu

TL;DR

The paper tackles concurrent appearance and semantic forgetting in Task-Incremental Learning for medical image segmentation across diverse tasks. It introduces Comprehensive Generative Replay (CGR), combining a Bayesian Joint Diffusion model to synthesize image–mask pairs with preserved correspondence and a Task-Oriented Adapter to tailor diffusion via task-conditioned prompts. It employs memory-evoking replay and updating to maintain past task knowledge while incorporating new information. Experiments on cardiac, fundus, and prostate segmentation demonstrate reduced forgetting and competitive performance against JointTrain, DIL, and CIL baselines, with strong robustness to learning order and clear ablations showing the value of BJD and TOA.

Abstract

Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm in which tasks arrive sequentially rather than being gathered together, which strict data-sharing policies often prohibit. However, task evolution can span a wide scope, involving intricately correlated shifts in both image appearance and segmentation semantics and thereby causing concurrent appearance and semantic forgetting. To solve this issue, we propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs to mimic past task data. The framework focuses on two aspects: modeling image-mask correspondence and promoting scalability for diverse tasks. Specifically, we introduce a novel Bayesian Joint Diffusion (BJD) model for high-quality synthesis of image-mask pairs with their correspondence explicitly preserved by conditional denoising. Furthermore, we develop a Task-Oriented Adapter (TOA) that recalibrates prompt embeddings to modulate the diffusion model, making the data synthesis compatible with different tasks. Experiments on incremental tasks (cardiac, fundus, and prostate segmentation) show the clear advantage of CGR in alleviating concurrent appearance and semantic forgetting. Code is available at https://github.com/jingyzhang/CGR.
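
The phrase "correspondence explicitly preserved by conditional denoising" can be read through the chain rule of probability. As a hedged sketch, with x an image, y its mask, and c a task prompt (symbols chosen here for illustration, not taken from the paper), the joint distribution of a pair factorizes in either of two ways:

```latex
p(x, y \mid c) \;=\; p(y \mid c)\, p(x \mid y, c) \;=\; p(x \mid c)\, p(y \mid x, c)
```

Either factorization lets a sampler draw one member of the pair first and then denoise the other conditioned on it, which keeps the two aligned. This summary does not state which ordering BJD adopts.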

Paper Structure

This paper contains 9 sections, 7 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Illustration of our proposed Comprehensive Generative Replay (CGR) framework for task-incremental learning, e.g., on prostate, fundus, and cardiac segmentation. Specifically, we synthesize paired images and segmentation masks to simulate past task data, by adopting a Bayesian Joint Diffusion (BJD) model to preserve image-mask correspondence (see the BJD section), and equipping a Task-Oriented Adapter (TOA) on the CLIP-based embedding to modulate the diffusion model for scalable data synthesis (see the TOA section). When encountering a new task, we leverage replayed past task data to evoke the faded memory, and update it to include this new task knowledge for future replays (see the replay section).
  • Figure 2: Segmentation examples by learning on sequentially arriving tasks for cardiac, fundus, and prostate segmentation.
  • Figure 3: Ablation analysis. Synthesized image-mask pairs are displayed in (a) with and without BJD, and their t-SNE visualization is shown in (b) with and without TOA.
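
The replay cycle described in the Figure 1 caption (evoke memory with replayed past-task pairs, then update the memory with the new task) can be sketched as a small scheduling loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `ToyReplayGenerator`, `build_training_set`, `run_incremental`, and the `replay_per_task` parameter are hypothetical names, the "generator" only emits placeholder strings instead of running a diffusion model, and the segmentation training step is left as a comment.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyReplayGenerator:
    """Hypothetical stand-in for a task-conditioned image-mask generator."""
    known_tasks: list = field(default_factory=list)

    def update(self, task):
        # Memory updating: register the new task so it can be replayed later.
        if task not in self.known_tasks:
            self.known_tasks.append(task)

    def sample(self, task, n):
        # Memory evoking: emit n placeholder (image, mask, task) triples.
        if task not in self.known_tasks:
            raise KeyError(f"generator has not learned task {task!r}")
        return [(f"synthetic_image_{task}_{i}", f"synthetic_mask_{task}_{i}", task)
                for i in range(n)]


def build_training_set(real_pairs, task, generator, replay_per_task, rng):
    """Mix real pairs of the current task with replayed pairs of past tasks."""
    mixed = [(image, mask, task) for image, mask in real_pairs]
    for past_task in generator.known_tasks:
        mixed.extend(generator.sample(past_task, replay_per_task))
    rng.shuffle(mixed)
    return mixed


def run_incremental(task_stream, replay_per_task=2, seed=0):
    """Walk through tasks in arrival order, replaying all earlier tasks each time."""
    rng = random.Random(seed)
    generator = ToyReplayGenerator()
    history = []
    for task, real_pairs in task_stream:
        train_set = build_training_set(real_pairs, task, generator, replay_per_task, rng)
        # A real pipeline would train the segmentation model on train_set here.
        history.append((task, train_set))
        # Only after training is the generator extended with the new task.
        generator.update(task)
    return history, generator


# Toy task stream; the file names are placeholders, not real data.
stream = [
    ("cardiac", [("cardiac_img_0", "cardiac_mask_0")]),
    ("fundus", [("fundus_img_0", "fundus_mask_0")]),
    ("prostate", [("prostate_img_0", "prostate_mask_0")]),
]
history, generator = run_incremental(stream)
for task, train_set in history:
    print(task, sorted({t for _, _, t in train_set}), len(train_set))
```

In this sketch the first task trains on real data only, while each later task trains on its own real pairs plus `replay_per_task` synthetic pairs for every earlier task, so the real data of past tasks never needs to be stored.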