Decoupling Continual Semantic Segmentation

Yifu Guo; Yuquan Lu; Wentao Zhang; Zishan Xu; Dexia Chen; Siyu Zhang; Yizhe Zhang; Ruixuan Wang

TL;DR

This paper tackles continual semantic segmentation (CSS) by decoupling class-aware existence detection from class-agnostic segmentation. It introduces DecoupleCSS, a two-stage framework. The first stage performs language-guided detection with LoRA adapters and generates location-aware, class-specific prompts. The second stage feeds these prompts to SAM to obtain segmentation masks, and the segmentation module stays frozen to promote knowledge sharing across classes. The approach yields state-of-the-art results on PASCAL VOC 2012 and ADE20K across multiple CSS settings, and ablations show gains from LoRA, per-class prompt generation, and semantic alignment. It also shows that vision-language foundation models can be used for CSS with manageable memory overhead and predictable inference time, a step toward scalable continual learning for dense prediction tasks.

Abstract

Continual Semantic Segmentation (CSS) requires learning new classes without forgetting previously acquired knowledge, addressing the fundamental challenge of catastrophic forgetting in dense prediction tasks. However, existing CSS methods typically employ single-stage encoder-decoder architectures where segmentation masks and class labels are tightly coupled, leading to interference between old and new class learning and suboptimal retention-plasticity balance. We introduce DecoupleCSS, a novel two-stage framework for CSS. By decoupling class-aware detection from class-agnostic segmentation, DecoupleCSS enables more effective continual learning, preserving past knowledge while learning new classes. The first stage leverages pre-trained text and image encoders, adapted using LoRA, to encode class-specific information and generate location-aware prompts. In the second stage, the Segment Anything Model (SAM) is employed to produce precise segmentation masks, ensuring that segmentation knowledge is shared across both new and previous classes. This approach improves the balance between retention and adaptability in CSS, achieving state-of-the-art performance across a variety of challenging tasks. Our code is publicly available at: https://github.com/euyis1019/Decoupling-Continual-Semantic-Segmentation.
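The abstract says the pre-trained encoders are adapted with LoRA. As a rough illustration of that general technique (not the authors' implementation), the sketch below adds a trainable low-rank update to a frozen weight matrix. The class name `LoRALinear`, the helper `matvec`, and the parameters `rank` and `alpha` are hypothetical names chosen for this example. Where DecoupleCSS places its adapters is not stated here.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class LoRALinear:
    """Hypothetical LoRA layer: y = W x + (alpha / rank) * B (A x).

    W stays frozen; only the low-rank factors A and B would be trained.
    """

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(W), len(W[0])
        self.W = W  # frozen pre-trained weight, never modified here
        # A starts with small random values, B with zeros, so the adapted
        # layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        delta = matvec(self.B, matvec(self.A, x))   # low-rank update path
        return [b + self.scale * d for b, d in zip(base, delta)]


# Usage: before any training, the output matches the frozen layer.
layer = LoRALinear([[1.0, 0.0], [0.0, 1.0]], rank=1)
initial_output = layer.forward([2.0, 3.0])
```

Because only `A` and `B` change during adaptation, the pre-trained weight `W` is preserved, which is the property that makes low-rank adapters attractive when old knowledge has to be retained.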
