Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting
Wei Li, Jingyang Zhang, Pheng-Ann Heng, Lixu Gu
TL;DR
The paper tackles concurrent appearance and semantic forgetting in Task-Incremental Learning (TIL) for medical image segmentation across diverse tasks. It introduces Comprehensive Generative Replay (CGR), which combines a Bayesian Joint Diffusion (BJD) model that synthesizes image-mask pairs with their correspondence preserved and a Task-Oriented Adapter (TOA) that adapts the diffusion model to each task through task-conditioned prompts. CGR uses memory-evoking replay and updating to retain knowledge of past tasks while incorporating new information. Experiments on cardiac, fundus, and prostate segmentation demonstrate reduced forgetting and competitive performance against JointTrain, DIL, and CIL baselines, with strong robustness to learning order and ablations that show the contribution of both BJD and TOA.
Abstract
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm in which tasks arrive sequentially, rather than being gathered into a single dataset, which strict data-sharing policies often prohibit. However, task evolution can span a wide scope, involving intricately correlated shifts in both image appearance and segmentation semantics, which causes concurrent appearance and semantic forgetting. To solve this issue, we propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs that mimic past task data. The framework focuses on two aspects: modeling image-mask correspondence and promoting scalability to diverse tasks. Specifically, we introduce a novel Bayesian Joint Diffusion (BJD) model for high-quality synthesis of image-mask pairs, with their correspondence explicitly preserved by conditional denoising. Furthermore, we develop a Task-Oriented Adapter (TOA) that recalibrates prompt embeddings to modulate the diffusion model, making the data synthesis compatible with different tasks. Experiments on incremental tasks (cardiac, fundus, and prostate segmentation) show the clear advantage of CGR in alleviating concurrent appearance and semantic forgetting. Code is available at https://github.com/jingyzhang/CGR.
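As a rough illustration of the generative replay idea described in the abstract, the sketch below mixes real image-mask pairs from the current task with synthetic pairs drawn from a task-conditioned generator for each past task. It is a minimal sketch, not the authors' implementation: `build_replay_batch`, `toy_generator`, and `replay_per_task` are hypothetical names, and the toy generator is only a stand-in for a diffusion model such as BJD with TOA. It derives each image from its mask so that the pair stays in correspondence by construction.

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # placeholder for pixel intensities
Mask = List[int]     # placeholder for per-pixel labels


def toy_generator(task_id: str, rng: random.Random) -> Tuple[Image, Mask]:
    """Hypothetical stand-in for a task-conditioned generative model.

    A real system would sample from a learned model conditioned on task_id;
    here task_id is unused and the image is derived from a random mask.
    """
    mask = [rng.randint(0, 1) for _ in range(4)]
    # Foreground pixels are bright and background pixels are dark, so the
    # synthetic image and mask agree by construction.
    image = [0.8 * m + 0.1 * rng.random() for m in mask]
    return image, mask


def build_replay_batch(
    new_pairs: Sequence[Tuple[Image, Mask]],
    generator: Callable[[str, random.Random], Tuple[Image, Mask]],
    past_task_ids: Sequence[str],
    replay_per_task: int,
    rng: random.Random,
) -> List[Tuple[Image, Mask, str]]:
    """Combine real pairs of the current task with replayed synthetic pairs."""
    batch = [(img, mask, "real") for img, mask in new_pairs]
    for task_id in past_task_ids:
        for _ in range(replay_per_task):
            img, mask = generator(task_id, rng)
            batch.append((img, mask, f"replay:{task_id}"))
    rng.shuffle(batch)  # interleave old and new knowledge within the batch
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    current = [([0.5] * 4, [1, 0, 1, 0]) for _ in range(3)]
    mixed = build_replay_batch(current, toy_generator, ["cardiac", "fundus"], 2, rng)
    print(len(mixed), "training pairs in the mixed batch")
```

In an actual pipeline, the segmentation model would be trained on such mixed batches, so that samples standing in for earlier tasks are seen alongside data from the new task.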
