Adaptive Margin Contrastive Learning for Ambiguity-aware 3D Semantic Segmentation

Yang Chen, Yueqi Duan, Runzhong Zhang, Yap-Peng Tan

TL;DR

AMContrast3D is an adaptive margin contrastive learning method for 3D point cloud semantic segmentation. It aims to ensure the correctness of low-ambiguity points while tolerating mistakes on high-ambiguity points, and it is reported to outperform state-of-the-art methods on S3DIS and ScanNet.

Abstract

In this paper, we propose an adaptive margin contrastive learning method for 3D point cloud semantic segmentation, namely AMContrast3D. Most existing methods use equally penalized objectives, which ignore per-point ambiguities and the less discriminative features that stem from transition regions. However, because highly ambiguous points may be indistinguishable even for humans, their manually annotated labels are less reliable, and hard constraints on these points lead to sub-optimal models. To address this, we design adaptive objectives for individual points based on their ambiguity levels, aiming to ensure the correctness of low-ambiguity points while allowing mistakes for high-ambiguity points. Specifically, we first estimate ambiguities from position embeddings. Then, we develop a margin generator that shifts the decision boundaries for contrastive feature embeddings, so that margins narrow as ambiguity increases and become negative for extremely high-ambiguity points. Experimental results on the large-scale datasets S3DIS and ScanNet demonstrate that our method outperforms state-of-the-art methods.
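
To make the margin idea in the abstract concrete, the following is a minimal sketch, not the paper's formulation. It assumes a hypothetical linear rule that maps an ambiguity $a_i \in [0,1]$ to a margin $m_i$, and an InfoNCE-style loss in which the margin is subtracted from the positive similarities. The function names, the `scale` and `tau` parameters, and the linear mapping are all illustrative assumptions.

```python
import math


def ambiguity_to_margin(a, scale=0.5):
    """Map ambiguity a in [0, 1] to a margin that shrinks as a grows.

    Hypothetical linear rule (an assumption, not taken from the paper):
    a = 0 gives +scale, a = 0.5 gives 0, and a = 1 gives -scale, i.e. a
    negative margin for extremely ambiguous points.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("ambiguity must lie in [0, 1]")
    return scale * (1.0 - 2.0 * a)


def margin_contrastive_loss(pos_sims, neg_sims, margin, tau=0.1):
    """InfoNCE-style loss for one anchor point with a margin on positives.

    The margin is subtracted from every positive (intra-class) similarity.
    A positive margin therefore demands a larger gap between positives and
    negatives, while a negative margin relaxes the objective.
    """
    pos = sum(math.exp((s - margin) / tau) for s in pos_sims)
    neg = sum(math.exp(s / tau) for s in neg_sims)
    return -math.log(pos / (pos + neg))


# Same similarities, different ambiguity: the low-ambiguity point is
# penalized more strictly than the high-ambiguity point.
strict = margin_contrastive_loss([0.8, 0.7], [0.3, 0.2], ambiguity_to_margin(0.1))
relaxed = margin_contrastive_loss([0.8, 0.7], [0.3, 0.2], ambiguity_to_margin(0.9))
```

With identical similarities, `strict` is larger than `relaxed`, which mirrors the stated goal of enforcing correctness on low-ambiguity points while tolerating errors on high-ambiguity ones.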

Paper Structure

This paper contains 15 sections, 10 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Adaptive margin from ambiguity. (a) Position embeddings indicate the per-point ambiguity $a_i$, colored by a map ranging from $0$ to $1$. (b) Feature embeddings yield the intra-pair similarity $S^+$ and the inter-pair similarity $S^-$; the ambiguity-aware margin $m_i$ adjusts the decision boundaries $DB^+$ and $DB^-$ in contrastive learning, which generates adaptive objectives that benefit embedding learning.
  • Figure 2: AMContrast3D with an encoder-decoder network architecture. In the ambiguity estimation framework following the $s^{th}$ encoder layer, we infer the ambiguity $a_i \in \mathcal{A}^s$ of the $i^{th}$ point by encoding the position embeddings $p_i,p_j, p_k \in \mathcal{P}^s$, where $j$ indexes the intra-points in neighborhood $\mathcal{N}_i^+$ and $k$ indexes the inter-points in neighborhood $\mathcal{N}_i^-$. We reformulate $a_i$ into adaptive ambiguity-aware margins $m_i \in \mathcal{M}^s$. These margins act on the feature embeddings $f_i,f_j,f_k \in \mathcal{F}^s$ of each corresponding decoder layer to dynamically adjust decision boundaries during contrastive learning. Through adaptive margin contrastive learning, our method automatically regulates training difficulty across different parts of the point cloud, in particular giving more stable training for high-ambiguity points in transition regions that contain different semantic classes.
  • Figure 3: Ambiguity visualization. A 3D point cloud scene is labeled with different semantic classes. We visualize the ambiguity of each point, where colors from white to black indicate ambiguity levels in the range $[0,1]$.
  • Figure 4: Visualization results on S3DIS (Area 5). From left to right: the input scene, the ground-truth semantic labels, the predictions of PointNeXt, and the predictions of our method.
  • Figure 5: Visualization results on ScanNet. From left to right: the input scene, the ground-truth semantic labels, the predictions of PointNeXt, and the predictions of our method.
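
The captions of Figures 1 to 3 describe a per-point ambiguity $a_i \in [0,1]$ derived from intra-class and inter-class neighbors. As a rough illustration only, the sketch below assumes a hypothetical rule: among the `k` nearest neighbors of a point, it reports the inverse-distance-weighted share of neighbors carrying a different label. It works on raw coordinates and labels as a stand-in for the learned position embeddings, and `estimate_ambiguity`, `k`, and the weighting are assumptions, not the paper's estimator.

```python
import math


def estimate_ambiguity(points, labels, k=4):
    """Return one ambiguity value in [0, 1] per point.

    Hypothetical rule (an assumption, not the paper's estimator): take the
    k nearest neighbours of point i, weight each by inverse distance, and
    report the weighted share whose label differs from the label of i.
    """
    eps = 1e-8  # avoids division by zero for coincident points
    ambiguities = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Distances to all other points, nearest first, truncated to k.
        neighbours = sorted(
            ((math.dist(p, q), labels[j]) for j, q in enumerate(points) if j != i),
            key=lambda pair: pair[0],
        )[:k]
        intra = sum(1.0 / (d + eps) for d, lab in neighbours if lab == label)
        inter = sum(1.0 / (d + eps) for d, lab in neighbours if lab != label)
        total = intra + inter
        ambiguities.append(inter / total if total > 0 else 0.0)
    return ambiguities


# Six points on a line: three of class "chair" followed by three of class "table".
pts = [(float(x), 0.0, 0.0) for x in range(6)]
cls = ["chair"] * 3 + ["table"] * 3
amb = estimate_ambiguity(pts, cls, k=2)
```

In this toy scene the points at the class boundary (indices 2 and 3) receive a higher ambiguity than the points deep inside each class, which matches the intuition that transition regions are the ambiguous ones.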