DynaSeg: A Deep Dynamic Fusion Method for Unsupervised Image Segmentation Incorporating Feature Similarity and Spatial Continuity

Boujemaa Guermazi, Naimul Khan

TL;DR

DynaSeg is an unsupervised image segmentation approach that balances feature similarity and spatial continuity without relying on extensive hyperparameter tuning.

Abstract

Our work tackles the fundamental challenge of image segmentation in computer vision, which is crucial for diverse applications. While supervised methods demonstrate proficiency, their reliance on extensive pixel-level annotations limits scalability. We introduce DynaSeg, an unsupervised image segmentation approach that balances feature similarity and spatial continuity without relying on extensive hyperparameter tuning. Unlike traditional methods, DynaSeg employs a dynamic weighting scheme that automates parameter tuning, adapts flexibly to image characteristics, and integrates easily with other segmentation networks. By incorporating a Silhouette Score Phase, DynaSeg prevents undersegmentation failures in which the number of predicted clusters converges to one. DynaSeg uses CNN-based and pre-trained ResNet feature extraction, making it computationally efficient and simpler than more complex models. Experimental results show state-of-the-art performance, with mIoU improvements of 12.2% and 14.12% over current unsupervised segmentation approaches on the COCO-All and COCO-Stuff datasets, respectively. We provide qualitative and quantitative results on five benchmark datasets, demonstrating the efficacy of the proposed approach. Code is available at https://github.com/RyersonMultimediaLab/DynaSeg
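
To make the role of the Silhouette Score Phase more concrete, the following is a minimal sketch of how a silhouette score could be used to pick a lower bound on the number of clusters. It is not taken from the DynaSeg code: the function names `silhouette_score` and `pick_opt_nc`, the use of Euclidean distance, and the idea of comparing pre-computed candidate labelings are all assumptions made for illustration.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points (standard definition, Euclidean distance)."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treated here as the worst case
    scores = []
    for p, c in zip(points, labels):
        own = clusters[c]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, o) for o in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, o) for o in pts) / len(pts)
                for k, pts in clusters.items() if k != c)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

def pick_opt_nc(points, candidates):
    """Return the cluster count whose labeling has the highest silhouette score.

    `candidates` maps a cluster count to a labeling of `points` (hypothetical input).
    """
    return max(candidates, key=lambda k: silhouette_score(points, candidates[k]))

# Toy example: two well-separated groups of 1-D features.
points = [(0.0,), (0.1,), (5.0,), (5.1,)]
candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
opt_nc = pick_opt_nc(points, candidates)
```

In this toy case the two-cluster labeling scores higher, so `opt_nc` is 2. A value chosen this way could then serve as a floor on the cluster count during training, which is the behavior the abstract attributes to the Silhouette Score Phase.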

Paper Structure (31 sections, 8 equations, 6 figures, 6 tables)

This paper contains 31 sections, 8 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Overview of the DynaSeg Framework: The Feature Extractor Network generates a feature map, classified into $q$ clusters via a linear classifier and batch normalization, resulting in a normalized response map. Cluster labels $c_i$ are assigned to each pixel using the argmax function. The number of clusters $q'$ is dynamically updated based on feature similarity and spatial continuity. Loss $L$ is computed during backpropagation, with parameters updated using SGD. This process iterates $T$ times to refine cluster labels $c_i$, achieving final segmentation. The Silhouette Score sets $opt\_nC$ as the threshold for $q'$ to prevent under-segmentation. Black arrows indicate the feedforward path, while red arrows represent backpropagation.
  • Figure 2: Results for different $\mu$ values on a sample image from the BSD500 dataset.
  • Figure 3: Qualitative results on selected BSD500 and PASCAL VOC2012 images. Pixels shown in the same color were assigned the same cluster label by the algorithm. See Section \ref{qualres} for a discussion of these results.
  • Figure 4: Qualitative results on Pascal VOC 2012: Original image, DynaSeg - SCF predicted segmentation, and Diff predicted segmentation.
  • Figure 5: Qualitative results on selected iCoseg and Pixabay images. Pixels shown in the same color were assigned the same cluster label by the algorithm. See Section \ref{qualres} for a discussion of these results.
  • ...and 1 more figure
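
The forward pass described in the Figure 1 caption (response map, argmax labels $c_i$, cluster count $q'$, combined loss $L$) can be sketched in a few lines. This is a rough illustration under stated assumptions, not the authors' implementation: the cross-entropy form of the similarity term, the L1 form of the continuity term, the weight `mu = q_prime / alpha`, and the constant `alpha` are all assumed here, and the real method would run on network outputs with SGD rather than on nested lists.

```python
import math

def argmax_labels(response):
    """Assign each pixel the index of its largest response (c_i = argmax)."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row]
            for row in response]

def similarity_loss(response, labels):
    """Mean cross-entropy between softmax responses and the argmax pseudo-labels."""
    total, n = 0.0, 0
    for row, label_row in zip(response, labels):
        for px, c in zip(row, label_row):
            m = max(px)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(v - m) for v in px))
            total += log_z - px[c]
            n += 1
    return total / n

def continuity_loss(response):
    """Mean L1 difference between responses of vertically and horizontally adjacent pixels."""
    h, w = len(response), len(response[0])
    total, n = 0.0, 0
    for y in range(h):
        for x in range(w):
            if y + 1 < h:
                total += sum(abs(a - b) for a, b in zip(response[y][x], response[y + 1][x]))
                n += 1
            if x + 1 < w:
                total += sum(abs(a - b) for a, b in zip(response[y][x], response[y][x + 1]))
                n += 1
    return total / n if n else 0.0

def dynamic_loss(response, alpha=50.0):
    """Combine both terms with a weight that depends on the current cluster count q'."""
    labels = argmax_labels(response)
    q_prime = len({c for row in labels for c in row})
    mu = q_prime / alpha  # assumed weighting rule, for illustration only
    loss = similarity_loss(response, labels) + mu * continuity_loss(response)
    return loss, q_prime, mu

# Toy 2x2 response map with q = 3 channels per pixel.
response = [
    [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]],
]
loss, q_prime, mu = dynamic_loss(response)
```

Because `mu` is recomputed from `q_prime` on every call, the balance between the two terms shifts as clusters merge during training, which is the intuition behind replacing a hand-tuned fixed weight with a dynamic one.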