Beyond Background Shift: Rethinking Instance Replay in Continual Semantic Segmentation

Hongmei Yin, Tingliang Feng, Fan Lyu, Fanhua Shang, Hongying Liu, Wei Feng, Liang Wan

TL;DR

This work tackles continual semantic segmentation (CSS) by addressing the background shift caused by partial labeling. It introduces Enhanced Instance Replay (EIR), a full pipeline that stores old-class instances, selectively combines them with new images based on contextual predictions, places them strategically in the image background, and trains with a Region-Specific Knowledge Distillation (RSKD) loss. The approach yields substantial improvements over state-of-the-art CSS methods on Pascal VOC 2012 and ADE20K, demonstrating a stronger balance between retaining old knowledge and learning new classes. By mitigating background confusion in both stored and new images, EIR advances practical CSS capabilities and offers a scalable framework for future improvements in instance-aware continual learning.

Abstract

In this work, we focus on continual semantic segmentation (CSS), where segmentation networks are required to continuously learn new classes without erasing knowledge of previously learned ones. Although storing images of old classes and directly incorporating them into the training of new models has proven effective in mitigating catastrophic forgetting in classification tasks, this strategy presents notable limitations in CSS. Specifically, stored and new images with partial category annotations lead to confusion between unannotated categories and the background, complicating model fitting. To tackle this issue, this paper proposes a novel Enhanced Instance Replay (EIR) method, which not only preserves knowledge of old classes and eliminates background confusion by storing instances of old classes, but also mitigates background shift in new images by integrating stored instances with them. By effectively resolving background shift in both stored and new images, EIR alleviates catastrophic forgetting in the CSS task, thereby enhancing the model's capacity for CSS. Experimental results validate the efficacy of our approach, which significantly outperforms state-of-the-art CSS methods.
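
To make the partial-annotation problem concrete, the following is a minimal sketch of how a label map changes when only the classes of the current step are annotated. The function name `partial_labels`, the class ids, and the convention that background is id 0 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from typing import Iterable

BACKGROUND = 0  # assumed id for the background class


def partial_labels(full_mask: np.ndarray, current_classes: Iterable[int]) -> np.ndarray:
    """Keep only labels annotated at the current step; all other pixels become background."""
    keep = np.isin(full_mask, list(current_classes))
    return np.where(keep, full_mask, BACKGROUND)


# Hypothetical 3x3 label map: 1 = horse, 2 = car, 3 = person, 0 = true background.
full = np.array([[1, 1, 0],
                 [2, 2, 0],
                 [3, 0, 0]])

# At a step where only class 3 is annotated, classes 1 and 2 collapse into background.
step_mask = partial_labels(full, {3})
```

In this toy case the pixels of classes 1 and 2 carry the same label as true background, which is the confusion between unannotated categories and the background that the abstract describes.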

Paper Structure

This paper contains 17 sections, 3 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Comparison of traditional image replay (a) and our replay method (b). (a) In the stored image, only the old class "horse" is labeled, while other classes (the new class "person" and the old class "car") are labeled as background. In new images, both old classes ("horse") and future classes are labeled as background. (b) Our method avoids confusing information in the stored image by retaining only instances, and alleviates background shift by fusing these instances into the new image.
  • Figure 2: Demonstration of four replay methods in CSS tasks. The figure shows the detailed implementation process of image replay, vanilla instance replay, random copy-paste instance replay, and enhanced instance replay.
  • Figure 3: Segmentation results of vanilla instance replay and random copy-paste instance replay. Colored regions represent the model's predictions for the old classes, while regions without color annotations were predicted as background. The corresponding numerical results are shown in the replay comparison table.
  • Figure 4: The detailed architecture of our method. First, we sample instances from the old data according to their classes. Then, during class combination, we identify the potential old classes using the old model. During instance selection, we select instances of the potential classes from the instance pool. Next, we compute the position in the new image at which to replay each instance and fuse the instances with the new image to create a fused image. Finally, the model is trained on the fused image in an enhanced way.
  • Figure 5: Segmentation results of our method and previous methods on Pascal VOC 2012.
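
The fusion step described in the Figure 4 caption (placing a stored instance into a new image and updating its labels) can be illustrated with a rough sketch. The placement rule here, "take the first location whose footprint covers only background-labeled pixels", is a simplifying assumption for illustration and is not the position computation used by the paper. The function name `paste_instance` and all array shapes are likewise hypothetical.

```python
import numpy as np

BACKGROUND = 0  # assumed id for the background class


def paste_instance(image, label, inst_rgb, inst_mask, inst_class):
    """Paste one stored instance at the first location whose footprint is all background.

    Assumed placement rule (not the paper's): scan top-left to bottom-right.
    Returns (fused_image, fused_label); unchanged copies if no location fits.
    """
    H, W = label.shape
    h, w = inst_mask.shape
    fused_image, fused_label = image.copy(), label.copy()
    for top in range(H - h + 1):
        for left in range(W - w + 1):
            window = label[top:top + h, left:left + w]
            if np.all(window[inst_mask] == BACKGROUND):
                # Slices are views, so masked assignment writes into the fused arrays.
                fused_image[top:top + h, left:left + w][inst_mask] = inst_rgb[inst_mask]
                fused_label[top:top + h, left:left + w][inst_mask] = inst_class
                return fused_image, fused_label
    return fused_image, fused_label


# Hypothetical new image: 4x4 pixels, new class 5 annotated in the top-left corner.
image = np.zeros((4, 4, 3), dtype=np.uint8)
label = np.zeros((4, 4), dtype=np.int64)
label[0:2, 0:2] = 5

# Hypothetical stored instance of old class 1: a 2x2 white patch.
inst_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
inst_mask = np.ones((2, 2), dtype=bool)

fused_image, fused_label = paste_instance(image, label, inst_rgb, inst_mask, inst_class=1)
```

After fusion, the pasted pixels carry an explicit old-class label rather than background, while the annotated new-class region is left untouched. The class combination, instance selection, and enhanced training stages from the caption are not modeled in this sketch.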