Unsupervised Contrastive Learning for Efficient and Robust Spectral Shape Matching

Feifan Luo, Hongyang Chen

Abstract

Estimating correspondences between pairs of non-rigid deformable 3D shapes remains a significant challenge in computer vision and graphics. Deep functional map methods have become the go-to solution for this problem, but they focus on optimizing pointwise and functional maps, either individually or jointly, rather than directly enhancing feature representations in the embedding space. This often results in inadequate feature quality and suboptimal matching performance. Furthermore, these approaches rely heavily on traditional functional map techniques, such as time-consuming functional map solvers, which incur substantial computational costs. In this work, we introduce the first unsupervised contrastive learning-based approach for efficient and robust 3D shape matching. We begin by presenting an unsupervised contrastive learning framework that promotes feature learning by maximizing consistency within positive similarity pairs and minimizing it within negative similarity pairs, thereby improving both the consistency and discriminability of the learned features. We then design a significantly simplified functional map learning architecture that eliminates the need for computationally expensive functional map solvers and multiple auxiliary functional map losses, greatly improving computational efficiency. By integrating these two components into a unified two-branch pipeline, our method achieves state-of-the-art performance in both accuracy and efficiency. Extensive experiments demonstrate that our approach is not only computationally efficient but also outperforms current state-of-the-art methods across various challenging benchmarks, including near-isometric, non-isometric, and topologically inconsistent scenarios, even surpassing supervised techniques.

Paper Structure (25 sections, 19 equations, 7 figures, 5 tables)


Figures (7)

  • Figure 1: A visual example illustrating the similarity score $\mathbf{S}_{\mathcal{XY}}$, the positive similarity pair $\mathbf{S}^{+}_{\mathcal{XY}}$, and the negative similarity pair $\mathbf{S}^{-}_{\mathcal{XY}}$.
  • Figure 2: A visual example of the effect of the two contrastive losses. Feature-space Euclidean distances are computed from the source point $x_0$ to the point $x_1$ on shape $\mathcal{X}$ and to the points $y_0$, $y_1$ on shape $\mathcal{Y}$; hotter colors indicate smaller distances and colder colors larger ones.
  • Figure 3: An overview of our method. (1) Feature Extraction: Learned features $\mathbf{F}_{\mathcal{X}}$ and $\mathbf{F}_{\mathcal{Y}}$ are extracted from shapes $\mathcal{X}$ and $\mathcal{Y}$, respectively. (2) Unsupervised Contrastive Learning Branch: The learned features are used to generate positive and negative similarity pairs $\mathbf{S}_{\mathcal{XY}}^{+}$, $\mathbf{S}_{\mathcal{XY}}^{-}$, $\mathbf{S}_{\mathcal{XX}}^{-}$ via the hybrid similarity generator. Two unsupervised contrastive losses (the inter-contrastive loss and the self-contrastive loss) are then applied for feature enhancement. (3) Simplified Functional Map Branch: The differentiable pointwise map ${\Pi}_\mathcal{XY}$ is computed using the softmax operator, and the functional map ${\mathbf{C}}_\mathcal{YX}$ is calculated from ${\Pi}_\mathcal{XY}$ via spectral basis projection. A functional alignment loss is constructed to supervise both pointwise and functional map learning.
  • Figure 4: Comparisons with other methods on shape matching with topological noise [lahner2016shrec]. Our results, with smoother and more accurate texture distributions, illustrate that our approach is more robust to topological noise than existing methods.
  • Figure 5: Runtime comparison with different numbers of vertices. Left: Average training time per method (100 iterations). Right: Average inference time per method (100 shape pairs). The ULRSSM(+TTA) runtime (black line) is plotted against the right-hand axis because of its exceptional magnitude ($\geq$ 50× longer than other methods); the remaining competing approaches are scaled against the left-hand axis. Our method achieves the best performance.
  • ...and 2 more figures
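
The simplified functional map branch described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the formulation common in deep functional map work: a row-wise softmax over feature similarities gives $\Pi_{\mathcal{XY}}$, and projecting it onto the spectral bases gives $\mathbf{C}_{\mathcal{YX}} = \Phi_{\mathcal{X}}^{\dagger} \Pi_{\mathcal{XY}} \Phi_{\mathcal{Y}}$. The function names, the cosine similarity, the temperature `tau`, and the random stand-in bases are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_pointwise_map(feat_x, feat_y, tau=0.07):
    """Row-wise softmax over feature similarities; returns Pi_XY of shape (n_x, n_y)."""
    # Assumption: cosine similarity with temperature tau (not specified in the source).
    fx = feat_x / np.linalg.norm(feat_x, axis=1, keepdims=True)
    fy = feat_y / np.linalg.norm(feat_y, axis=1, keepdims=True)
    logits = fx @ fy.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def functional_map_from_pointwise(phi_x, phi_y, pi_xy):
    """Spectral basis projection (assumed form): C_YX = pinv(Phi_X) @ Pi_XY @ Phi_Y."""
    return np.linalg.pinv(phi_x) @ pi_xy @ phi_y

# Toy data: random features and random orthonormal bases standing in for
# the Laplace-Beltrami eigenfunctions a real pipeline would use.
rng = np.random.default_rng(0)
n_x, n_y, d, k = 50, 60, 16, 8
feat_x = rng.normal(size=(n_x, d))
feat_y = rng.normal(size=(n_y, d))
phi_x, _ = np.linalg.qr(rng.normal(size=(n_x, k)))
phi_y, _ = np.linalg.qr(rng.normal(size=(n_y, k)))

pi_xy = soft_pointwise_map(feat_x, feat_y)                  # (n_x, n_y), rows sum to 1
c_yx = functional_map_from_pointwise(phi_x, phi_y, pi_xy)   # (k, k)
```

Because both steps are closed-form matrix operations, no linear system has to be solved per shape pair, which is consistent with the efficiency argument made in the abstract.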

Theorems & Definitions (1)

  • Definition 1