Learning at a Glance: Towards Interpretable Data-limited Continual Semantic Segmentation via Semantic-Invariance Modelling
Bo Yuan, Danpei Zhao, Zhenwei Shi
TL;DR
This work tackles continual semantic segmentation (CSS) when only limited data is available at each incremental step. It introduces Learning at a Glance (LAG), an interpretable framework built on semantic-invariance modelling. LAG decomposes representations into semantic-invariant and sample-specific components via channel-wise decoupling and neuron-relevant semantic consistency, and couples this with a disentangled distillation regime (SPM and SFP), an explicit unknown class, and uncertainty-aware pseudo-labelling. The approach achieves competitive CSS performance on VOC, ADE20K, and ISPRS, showing strong resistance to forgetting and good adaptability under data-limited conditions, while providing interpretability through neuron-level relevance analysis. LAG is memory-free and combines interpretable constraints with a data-efficient protocol aimed at real-world scenarios.
Abstract
Continual semantic segmentation (CSS) based on incremental learning (IL) is an important step towards human-like segmentation models. However, current CSS approaches struggle with the trade-off between preserving old knowledge and learning new knowledge; they also still need large-scale annotated data for incremental training and lack interpretability. In this paper, we present Learning at a Glance (LAG), an efficient, robust, human-like and interpretable approach for CSS. Specifically, LAG is a simple and model-agnostic architecture, yet it achieves competitive CSS efficiency with limited incremental data. Inspired by human-like recognition patterns, we propose a semantic-invariance modelling approach via semantic feature decoupling that reconciles solid knowledge inheritance with new-term learning. Concretely, the decoupling is performed in two ways: channel-wise decoupling and spatial-level neuron-relevant semantic consistency. Our approach preserves semantic-invariant knowledge as solid prototypes to alleviate catastrophic forgetting, while also constraining sample-specific content through an asymmetric contrastive learning method to enhance model robustness during IL steps. Experimental results on multiple datasets validate the effectiveness of the proposed method. Furthermore, we introduce a novel CSS protocol that better reflects realistic data-limited CSS settings, and LAG achieves superior performance under multiple data-limited conditions.
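As a rough illustration of the two ideas named above, channel-wise decoupling and class prototypes, the sketch below splits a feature map into two channel groups and averages the first group per class. It is not the LAG implementation: the fixed channel split, the use of masked average pooling, and the names `split_channels`, `class_prototypes`, and `num_invariant` are all assumptions made for illustration, since the abstract does not specify how the decoupling or the prototypes are computed.

```python
import numpy as np


def split_channels(features, num_invariant):
    """Split a (C, H, W) feature map into two channel groups.

    Assumption: the first `num_invariant` channels stand in for the
    semantic-invariant part and the rest for the sample-specific part.
    """
    return features[:num_invariant], features[num_invariant:]


def class_prototypes(features, labels, num_classes):
    """Masked average pooling: one mean feature vector per class present.

    features: (C, H, W) array, labels: (H, W) integer class map.
    Returns a dict {class_id: (C,) prototype}; absent classes are skipped.
    """
    channels = features.shape[0]
    flat = features.reshape(channels, -1)   # (C, H*W)
    flat_labels = labels.reshape(-1)        # (H*W,)
    prototypes = {}
    for class_id in range(num_classes):
        mask = flat_labels == class_id
        if mask.any():
            prototypes[class_id] = flat[:, mask].mean(axis=1)
    return prototypes


# Toy data: 8 channels on a 4x4 grid with up to 3 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
labels = rng.integers(0, 3, size=(4, 4))

invariant, specific = split_channels(features, num_invariant=5)
prototypes = class_prototypes(invariant, labels, num_classes=3)
```

In a continual setting, prototypes like these could be stored after each step and compared against features from later steps to penalise drift; whether LAG does this in the same way is not stated in the abstract.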
