Mitigating Background Shift in Class-Incremental Semantic Segmentation
Gilhan Park, WonJun Moon, SuBeen Lee, Tae-Young Kim, Jae-Pil Heo
TL;DR
This work tackles background shift in Class-Incremental Semantic Segmentation (CISS) by introducing a background-class separation framework. It combines selective pseudo-labeling to ignore unreliable old-class regions, adaptive feature distillation that weights knowledge transfer by patch reliability, and a separation strategy that decouples background from new classes via label-guided distillation and an orthogonality constraint on class tokens. The approach yields state-of-the-art performance on Pascal VOC and ADE20K under both disjoint and overlapped continual-learning settings, with results indicating improved stability (retaining old classes) and plasticity (learning new classes). The contributions offer practical improvements for continual semantic segmentation in dynamic environments where new concepts emerge over time.
Abstract
Class-Incremental Semantic Segmentation (CISS) aims to learn new classes without forgetting the old ones, using only the labels of the new classes. To achieve this, two popular strategies are employed: 1) pseudo-labeling and knowledge distillation to preserve prior knowledge; and 2) background weight transfer, which leverages the broad coverage of the background in learning new classes by transferring the background weight to the new-class classifier. However, the first strategy relies heavily on the old model to detect old classes, and any pixels it fails to detect are treated as background, leading to a background shift towards the old classes (i.e., misclassification of old classes as background). Likewise, in the second strategy, initializing the new-class classifier with background knowledge triggers a similar background shift, but towards the new classes. To address these issues, we propose a background-class separation framework for CISS. First, selective pseudo-labeling and adaptive feature distillation are used to distill only trustworthy past knowledge. Second, we encourage separation between the background and the new classes with a novel orthogonal objective along with label-guided output distillation. Our state-of-the-art results validate the effectiveness of the proposed methods.
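To make two of the ideas above concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes that "selective" pseudo-labeling can be approximated by a fixed confidence threshold on the old model's output, and that the orthogonal objective can be approximated by penalizing the squared cosine similarity between a background token and each new-class token. The function names, the `threshold` value, the ignore index 255, and the plain-list data layout are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math

IGNORE = 255      # label excluded from the loss (assumed convention)
BACKGROUND = 0    # index of the background class (assumed)


def selective_pseudo_label(gt_labels, old_probs, threshold=0.7):
    """Fill background pixels of the current labels using the old model.

    gt_labels: per-pixel labels at the current step; only new classes are
               annotated, every other pixel is marked BACKGROUND.
    old_probs: per-pixel old-model probabilities; index 0 is background,
               indices 1.. are old classes.
    threshold: hypothetical confidence cut-off for trusting the old model.
    """
    out = []
    for label, probs in zip(gt_labels, old_probs):
        if label != BACKGROUND:
            out.append(label)        # annotated new-class pixel: keep as is
            continue
        best = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best] >= threshold:
            out.append(best)         # confident old class or background
        else:
            out.append(IGNORE)       # unreliable pixel: leave out of the loss
    return out


def orthogonality_penalty(background_token, new_class_tokens):
    """Mean squared cosine similarity between background and new-class tokens.

    The penalty is 0 when every new-class token is orthogonal to the
    background token. Tokens are assumed to be non-zero vectors.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    sims = [cosine(background_token, t) for t in new_class_tokens]
    return sum(s * s for s in sims) / len(sims)


# Four pixels: three unlabeled (background) and one annotated as new class 16.
gt = [0, 0, 0, 16]
old = [[0.9, 0.05, 0.05], [0.1, 0.85, 0.05], [0.4, 0.35, 0.25], [0.8, 0.1, 0.1]]
print(selective_pseudo_label(gt, old))  # → [0, 1, 255, 16]
```

In this sketch the third pixel is ignored because the old model is not confident about any class, so it is labeled neither as background nor as an old class. The penalty term would be added to the training loss so that a new-class classifier initialized from the background weight is pushed away from the background direction.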
