OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation

Baran Ozaydin, Tong Zhang, Deblina Bhattacharjee, Sabine Süsstrunk, Mathieu Salzmann

TL;DR

Unsupervised Semantic Segmentation (USS) struggles when learned features do not align with semantic concepts, because no explicit class definitions are available. OMH introduces a differentiable Optimally Matched Hierarchy that imposes structured sparsity over a multi-level set of cluster heads via Optimal Transport, enabling multi-granular semantic representations. The approach integrates with existing USS frameworks and achieves state-of-the-art performance on COCOStuff, Cityscapes, and Potsdam in both mIoU and accuracy, while remaining training-efficient. Because it is compatible with existing USS code and the authors plan a public release, structured sparsity is a practical enhancer for USS and a possible foundation for future hierarchical representations in unsupervised learning.

Abstract

Unsupervised Semantic Segmentation (USS) involves segmenting images without relying on predefined labels, aiming to alleviate the burden of extensive human labeling. Existing methods utilize features generated by self-supervised models and specific priors for clustering. However, their clustering objectives are not involved in the optimization of the features during training. Additionally, due to the lack of clear class definitions in USS, the resulting segments may not align well with the clustering objective. In this paper, we introduce a novel approach called Optimally Matched Hierarchy (OMH) to simultaneously address the above issues. The core of our method lies in imposing structured sparsity on the feature space, which allows the features to encode information with different levels of granularity. The structure of this sparsity stems from our hierarchy (OMH). To achieve this, we learn a soft but sparse hierarchy among parallel clusters through Optimal Transport. Our OMH yields better unsupervised segmentation performance compared to existing USS methods. Our extensive experiments demonstrate the benefits of OMH when utilizing our differentiable paradigm. We will make our code publicly available.

Paper Structure

This paper contains 18 sections, 11 equations, 5 figures, and 7 tables.

Figures (5)

  • Figure 1: Our method represents classes and their parts more accurately. Our OMH models intercluster relationships explicitly in a hierarchy and leads to better coverage of classes. Our method can link the head of the human to its body.
  • Figure 2: Overview of our method. We impose sparsity on the features via a clustering loss $L^{\text{cluster}}_i$ and bring structure to this sparsity via a Wasserstein loss $L^{\text{match}}$. Our $L^{\text{match}}$ limits the intersection between clusters from different hierarchy levels using Optimal Transport (Cuturi, 2013).
  • Figure 3: Hierarchy matrix with Optimal Transport. Optimal Transport (Cuturi, 2013) leads to a sparse and balanced hierarchy. On the left, we visualize the cluster center affinities $\mathrm{sim}^+(\mathbf{H}^{(i)}, \mathbf{H}^{(i+1)})$, which are neither sparse (many positive similarities) nor balanced (unequal row or column sums). On the right, we visualize the optimal transportation plan $\mathbf{A}^{(i,i+1)}$. In $\mathbf{A}^{(i,i+1)}$, two lower-level clusters are mapped to one higher-level cluster on average, although some higher-level clusters are mapped to three or four lower-level clusters.
  • Figure 4: Our hierarchy represents both similar classes and object parts. In the top row, middle column, we visualize the higher-level cluster, whereas the bottom row shows the corresponding lower-level ones. Our hierarchy can reason about part-whole relationships and class-class relationships.
  • Figure 5: Qualitative results for OMH, demonstrating the contribution of our method. The third row is without OMH and the fourth is with OMH. Results correspond to the last line of the COCO results table (label Tab.coco), without CRF.
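
The captions of Figures 2 and 3 describe turning dense cluster-center affinities into a sparse, balanced transport plan. The following is a minimal sketch of that idea, not the authors' implementation: it uses plain-Python Sinkhorn iterations, and the function names, the cost $1 - \mathrm{sim}^+$, the uniform marginals, the regularisation strength `eps`, and the iteration count are all assumptions made for illustration.

```python
import math


def positive_cosine_similarity(low, high):
    """Cosine similarity between every low/high center pair, clamped at zero."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors

    return [
        [max(0.0, sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))) for v in high]
        for u in low
    ]


def sinkhorn_plan(sim, eps=0.05, n_iters=500):
    """Entropy-regularised transport plan with uniform row and column marginals.

    Assumption: the transport cost is 1 - similarity, so similar centers are
    cheap to match. eps and n_iters are illustrative values.
    """
    n, m = len(sim), len(sim[0])
    row_mass, col_mass = 1.0 / n, 1.0 / m
    kernel = [[math.exp(-(1.0 - s) / eps) for s in row] for row in sim]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the target marginals.
        u = [row_mass / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: four lower-level centers and two higher-level centers in 2D.
low_centers = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
high_centers = [[1.0, 0.0], [0.0, 1.0]]

plan = sinkhorn_plan(positive_cosine_similarity(low_centers, high_centers))
# Index of the higher-level cluster that receives most of each row's mass.
parents = [max(range(len(row)), key=row.__getitem__) for row in plan]
print(parents)
```

In this toy setting every row of `plan` sums to 1/4 and every column to 1/2, so each higher-level cluster receives two lower-level clusters' worth of mass, mirroring the balanced behaviour described for $\mathbf{A}^{(i,i+1)}$. A smaller `eps` pushes the plan closer to a hard assignment, at the cost of slower convergence.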