Semantic Segmentation by Semantic Proportions

Halil Ibrahim Aysel, Xiaohao Cai, Adam Prügel-Bennett

TL;DR

This work proposes a novel approach to semantic segmentation that requires only rough information about the proportions of individual semantic classes (semantic proportions, for short) instead of ground-truth segmentation maps, which greatly simplifies the data annotation process.

Abstract

Semantic segmentation is a critical task in computer vision that aims to identify and classify individual pixels in an image, with numerous applications in, for example, autonomous driving and medical image analysis. However, semantic segmentation can be highly challenging, particularly because it needs large amounts of annotated data. Annotating images is a time-consuming and costly process that often requires expert knowledge and significant effort; moreover, storing the annotated images can dramatically increase storage requirements. In this paper, we propose a novel approach to semantic segmentation that requires only rough information about the proportions of individual semantic classes, abbreviated as semantic proportions, rather than ground-truth segmentation maps. This greatly simplifies the data annotation process and can thus significantly reduce annotation time, cost and storage space, opening up new possibilities for semantic segmentation tasks where obtaining full ground-truth segmentation maps may not be feasible or practical. The proposed use of semantic proportions can also (i) serve as a booster when ground-truth segmentation maps are available, improving performance without extra data or model complexity, and (ii) be seen as a parameter-free plug-and-play module that can be attached to existing deep neural networks designed for semantic segmentation. Extensive experimental results demonstrate the good performance of our method compared to benchmark methods that rely on ground-truth segmentation maps. Utilising semantic proportions as suggested in this work offers a promising direction for future semantic segmentation research.
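
To make the notion of semantic proportions concrete, here is a minimal sketch of how such a label could be derived from a full segmentation map, assuming an integer label map with one class index per pixel. The function name `semantic_proportions` and the toy mask are hypothetical and only illustrate the idea; in the setting described in the abstract, the proportions would be provided as rough annotations without any map.

```python
import numpy as np

def semantic_proportions(label_map, num_classes):
    """Return the fraction of pixels belonging to each class.

    label_map: 2-D integer array with values in [0, num_classes).
    """
    # Count pixels per class; minlength keeps absent classes at zero.
    counts = np.bincount(label_map.ravel(), minlength=num_classes)
    return counts / label_map.size

# Toy 4x4 mask with three classes (illustrative data only).
mask = np.array([
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 2, 2, 1],
    [0, 0, 2, 1],
])

sp = semantic_proportions(mask, num_classes=3)
print(sp)  # one number per class, summing to 1
```

The annotation for this image shrinks from 16 per-pixel labels to a vector of three numbers, which is the kind of storage and annotation saving the abstract refers to.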

Paper Structure (21 sections, 5 equations, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Difference between the proposed semantic segmentation approach and benchmark methods.
  • Figure 2: The SPSS (SP-based semantic segmentation) architecture. In the training stage, features are first extracted from the input by a CNN and then passed through a GAP layer that computes the SP. Training with the loss function $\mathcal{L}_{\rm sp}$ drives the extracted features to serve as predictions of the class-wise segmentation masks.
  • Figure 3: The SPSS+ architecture (cf. the SPSS architecture in Figure 2). In contrast to SPSS, $\mathcal{L}_{\rm total}$ (see Eq. \ref{eqn:loss-total}), a weighted average of $\mathcal{L}_{\rm sp}$ and $\mathcal{L}_{\rm sm}$, is calculated during training. Training with this loss drives the extracted features to serve as predictions of the class-wise segmentation masks.
  • Figure 4: Example images and ground-truth segmentation masks of the three employed medical imaging datasets.
  • Figure 5: Diagrams of the proposed models SPSS and SPSS+ on the datasets Aerial Dubai (left) and Electronic Microscopy (right; significant class imbalance), respectively.
  • ...and 4 more figures
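
The pipeline described in the captions of Figures 2 and 3 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: the CNN is replaced by random logits, GAP is taken to be a spatial mean, $\mathcal{L}_{\rm sp}$ is assumed to be a mean squared error, $\mathcal{L}_{\rm sm}$ is assumed to be pixel-wise cross-entropy, and the weighted average uses a hypothetical weight `alpha`. All function names are invented for this example.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def predicted_proportions(class_maps):
    """GAP step: average each (H, W) class map to one number per class."""
    return class_maps.mean(axis=(1, 2))

def sp_loss(pred_sp, target_sp):
    """Assumed form of L_sp: mean squared error between proportions."""
    return float(np.mean((pred_sp - target_sp) ** 2))

def sm_loss(class_maps, label_map, eps=1e-12):
    """Assumed form of L_sm: pixel-wise cross-entropy against a label map."""
    rows, cols = np.indices(label_map.shape)
    # Probability assigned to the true class at every pixel.
    p_true = class_maps[label_map, rows, cols]
    return float(-np.mean(np.log(p_true + eps)))

def total_loss(class_maps, target_sp, label_map, alpha=0.5):
    """Assumed form of L_total: alpha * L_sp + (1 - alpha) * L_sm."""
    l_sp = sp_loss(predicted_proportions(class_maps), target_sp)
    return alpha * l_sp + (1.0 - alpha) * sm_loss(class_maps, label_map)

rng = np.random.default_rng(0)
num_classes, height, width = 3, 4, 4

# Stand-in for the CNN output: one logit map per class, shape (C, H, W).
logits = rng.normal(size=(num_classes, height, width))
class_maps = softmax(logits, axis=0)   # per-pixel class probabilities

# SPSS: only the target proportions are needed for training.
label_map = rng.integers(0, num_classes, size=(height, width))
target_sp = np.bincount(label_map.ravel(), minlength=num_classes) / label_map.size
pred_sp = predicted_proportions(class_maps)
print("L_sp:", sp_loss(pred_sp, target_sp))

# SPSS+: the label map is also available, so both terms are combined.
print("L_total:", total_loss(class_maps, target_sp, label_map, alpha=0.5))

# At inference, the class maps themselves give the segmentation.
segmentation = class_maps.argmax(axis=0)
```

Because the per-pixel softmax outputs sum to one, their spatial average is itself a valid proportion vector, and the GAP step adds no trainable parameters, which is consistent with the abstract's description of a parameter-free module.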