IPSeg: Image Posterior Mitigates Semantic Drift in Class-Incremental Segmentation
Xiao Yu, Yan Fang, Yao Zhao, Yunchao Wei
TL;DR
This work tackles semantic drift in class-incremental semantic segmentation (CISS), which it attributes to separate optimization and noisy pseudo-labels. It proposes IPSeg, which combines image posterior guidance, to align optimization across incremental stages, with permanent-temporary semantics decoupling, to treat stable background and unknown semantics separately from dynamic foreground targets. In extensive experiments on Pascal VOC 2012 and ADE20K, IPSeg achieves state-of-the-art performance, especially in long-term incremental scenarios, and shows robustness to forgetting while making efficient use of memory buffers. The approach advances practical continual learning for pixel-wise tasks. Because it relies on replay, it also trades performance against memory and privacy costs; removing the memory buffer is left as potential future work.
Abstract
Class-incremental learning aims to enable models to learn from sequential, non-stationary data streams across different tasks without catastrophic forgetting. In class-incremental semantic segmentation (CISS), the semantic content of image pixels evolves over incremental phases, a phenomenon known as semantic drift. In this work, we identify two critical challenges in CISS that contribute to semantic drift and degrade performance. First, we highlight the issue of separate optimization, where different parts of the model are optimized in distinct incremental stages, leading to misaligned probability scales. Second, we identify noisy semantics arising from inappropriate pseudo-labeling, which leads to sub-optimal results. To address these challenges, we propose a novel and effective approach, Image Posterior and Semantics Decoupling for Segmentation (IPSeg). IPSeg introduces two key mechanisms: (1) leveraging image posterior probabilities to align optimization across stages and mitigate the effects of separate optimization, and (2) employing semantics decoupling to handle noisy semantics and tailor learning strategies for different semantics. Extensive experiments on the Pascal VOC 2012 and ADE20K datasets demonstrate that IPSeg achieves superior performance compared to state-of-the-art methods, particularly in challenging long-term incremental scenarios.
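To make the first mechanism concrete, the following is a minimal sketch of how image-level posterior probabilities could rescale pixel-wise predictions whose scores come from heads trained in different stages. The abstract does not specify the exact formulation, so the multiplicative rescaling, the function names (`softmax`, `rescale_with_image_posterior`), the tensor shapes, and the toy numbers are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def rescale_with_image_posterior(pixel_logits, image_posterior):
    """Hypothetical guidance step (an assumption, not the paper's exact rule).

    pixel_logits:    array of shape (C, H, W), per-pixel class scores.
    image_posterior: array of shape (C,), estimated probability that
                     each class is present anywhere in the image.
    Returns an (H, W) map of predicted class indices.
    """
    pixel_probs = softmax(pixel_logits, axis=0)
    # Broadcast the per-class image posterior over all pixels so that
    # classes judged absent from the image are suppressed everywhere.
    guided = pixel_probs * image_posterior[:, None, None]
    return guided.argmax(axis=0)

# Toy 1x2 image with 3 classes. Class 2 stands in for a head learned in
# the latest stage whose scores sit on a larger scale than older heads.
pixel_logits = np.array([
    [[1.0, 1.0]],  # class 0
    [[2.0, 0.5]],  # class 1
    [[3.0, 0.0]],  # class 2
])
# Assumed image-level posterior: class 2 is very unlikely to be present.
image_posterior = np.array([0.9, 0.8, 0.05])

unguided = softmax(pixel_logits, axis=0).argmax(axis=0)
labels = rescale_with_image_posterior(pixel_logits, image_posterior)
print(unguided.tolist())  # [[2, 0]]
print(labels.tolist())    # [[1, 0]]
```

In this toy case the first pixel flips from class 2 to class 1 once the image-level evidence is applied, which illustrates how a shared image posterior might compensate for misaligned probability scales across stages.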
