Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

TL;DR

Prototype-based representation learning often relies on prototypes derived from evolving features, which can drift and bias learning under long-tailed data. The authors propose Semantic Anchor Regularization (SAR), decoupling anchors from feature learning by embedding predefined class anchors into a semantic space and guiding learning through a classifier-aware auxiliary task with disentanglement strategies. SAR promotes intra-class compactness and inter-class separability, reduces reliance on biased prototypes, and improves robustness to long-tail distributions, demonstrated across Cityscapes, ADE20K, and Pascal-Context with multiple backbones. The method is plug-and-play with minimal overhead and outperforms previous prototype-based methods while providing qualitative and analytical insights.

Abstract

One of the ultimate goals of representation learning is to achieve compactness within a class and good separability between classes. Many outstanding metric-based and prototype-based methods following the Expectation-Maximization paradigm have been proposed for this objective. However, they inevitably introduce biases into the learning process, particularly with long-tail distributed training data. In this paper, we reveal that the class prototype need not be derived from training features, and we propose a novel perspective: using pre-defined class anchors as feature centroids to unidirectionally guide feature learning. However, the pre-defined anchors may have a large semantic distance from the pixel features, which prevents them from being directly applied. To address this issue and generate feature centroids independent of feature learning, a simple yet effective Semantic Anchor Regularization (SAR) is proposed. SAR ensures the inter-class separability of semantic anchors in the semantic space by employing a classifier-aware auxiliary cross-entropy loss during training via disentanglement learning. By pulling the learned features to these semantic anchors, several advantages can be attained: 1) intra-class compactness and natural inter-class separability, 2) avoidance of bias or errors induced by feature learning, and 3) robustness to the long-tailed problem. The proposed SAR can be used in a plug-and-play manner in existing models. Extensive experiments demonstrate that SAR performs better than previous sophisticated prototype-based methods. The implementation is available at https://github.com/geyanqi/SAR.

Paper Structure

This paper contains 40 sections, 9 equations, 4 figures, and 20 tables.

Figures (4)

  • Figure 1: The difference between prototypes and semantic anchors in feature space (UMAP-based). We train HRNet with two different seeds on Cityscapes to get these prototypes and semantic anchors. Shapes, colors, and CD represent random seeds, classes, and class dependencies, respectively. Semantic anchors are generated independently of the main task and achieve more consistent and weaker inter-class dependencies on imbalanced data.
  • Figure 2: Framework of the proposed method, which consists of a main stream (the lower stream) for segmentation/classification and an auxiliary stream (the upper stream) for SAR. Pre-defined class anchors are first embedded into the semantic space to mitigate the semantic gap and then categorized by the classifier of the main stream. The embedded anchors are ensembled into semantic anchors in an EMA manner. The learned feature with dimension is pulled to the corresponding semantic anchor for better intra-class compactness and inter-class separability. Bold pink lines highlight the proposed SAR.
  • Figure 3: Visualization of the learned features with HRNet and SAR on Cityscapes utilizing UMAP.
  • Figure 4: Qualitative results on ADE20K (left two columns), Cityscapes (middle two columns), and Pascal-Context (right two columns).
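
The two-stream pipeline described in the Figure 2 caption can be illustrated with a minimal NumPy sketch of the forward computations only. It is not the authors' implementation (see the linked repository for that). The one-layer `tanh` embedding, the one-hot-like predefined anchors, the squared-Euclidean pull term, the momentum value of 0.9, and all function and variable names are assumptions made for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under a row-wise softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def embed_anchors(predefined, w_embed):
    """Hypothetical one-layer embedding of predefined anchors into the semantic space."""
    return np.tanh(predefined @ w_embed)

def ema_update(semantic_anchors, embedded, momentum=0.9):
    """Ensemble embedded anchors into semantic anchors with an exponential moving average."""
    return momentum * semantic_anchors + (1.0 - momentum) * embedded

def pull_loss(features, labels, semantic_anchors):
    """Mean squared distance between each feature and its class's semantic anchor.

    In an autodiff framework the anchors would be treated as constants here,
    so that guidance flows only from anchors to features.
    """
    diff = features - semantic_anchors[labels]
    return (diff ** 2).sum(axis=1).mean()

rng = np.random.default_rng(0)
num_classes, anchor_dim, feat_dim = 4, 8, 6

predefined = np.eye(num_classes, anchor_dim)       # assumed fixed class anchors
w_embed = rng.normal(size=(anchor_dim, feat_dim))  # auxiliary-stream embedding weights
w_cls = rng.normal(size=(feat_dim, num_classes))   # stands in for the main-stream classifier

# Auxiliary stream: embed the anchors, then classify them with the main classifier.
embedded = embed_anchors(predefined, w_embed)
aux_ce = softmax_cross_entropy(embedded @ w_cls, np.arange(num_classes))

# EMA ensemble of the embedded anchors into semantic anchors (initialized at zero here).
semantic_anchors = ema_update(np.zeros((num_classes, feat_dim)), embedded)

# Main stream: pull (here random) features toward the anchor of their class.
labels = rng.integers(0, num_classes, size=10)
features = rng.normal(size=(10, feat_dim))
reg = pull_loss(features, labels, semantic_anchors)
```

In this sketch `aux_ce` plays the role of the classifier-aware auxiliary cross-entropy loss and `reg` the regularization term that would be added to the main task loss; how the two are weighted is not specified in this section.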