COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes

Muhammad Ali, Mamoona Javaid, Mubashir Noman, Mustansar Fiaz, Salman Khan

TL;DR

This work introduces COSNet, a segmentation network that combines boundary cues with multi-contextual information to accurately segment objects in cluttered scenes, so that recyclable objects can be separated efficiently from waste.

Abstract

Automated waste recycling aims to efficiently separate recyclable objects from waste by employing vision-based systems. However, the presence of objects with varying shapes and different material types makes this a challenging problem, especially in cluttered environments. Existing segmentation methods perform reasonably well on many semantic segmentation datasets by employing multi-contextual representations; however, their performance degrades when they are applied to waste object segmentation in cluttered scenarios. In addition, plastic objects further increase the complexity of the problem due to their translucent nature. To address these limitations, we introduce an efficacious segmentation network, named COSNet, that uses boundary cues along with multi-contextual information to accurately segment objects in cluttered scenes. COSNet introduces novel components, including a feature sharpening block (FSB) and a boundary enhancement module (BEM), for enhancing the features and highlighting the boundary information of irregular waste objects in cluttered environments. Extensive experiments on three challenging datasets, ZeroWaste-f, SpectralWaste, and ADE20K, demonstrate the effectiveness of the proposed method. COSNet achieves significant mIoU gains of 1.8% on ZeroWaste-f and 2.1% on SpectralWaste.
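For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from flattened label maps. The function name `mean_iou` and the toy labels are illustrative assumptions, and benchmark evaluation scripts may differ in detail (for example, accumulating counts over a whole dataset or handling ignored labels).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred, target: equal-length sequences of integer class labels
    (a flattened segmentation mask). Hypothetical helper for illustration.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
# Per-class IoU: class 0 = 1/3, class 1 = 2/3, class 2 = 1/2, so the mean is 0.5.
```

Under this definition, a gain of a few mIoU points means the per-class overlap between predicted and ground-truth masks improved on average across classes.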

Paper Structure

This paper contains 20 sections, 2 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Illustration of the overall architecture of the proposed segmentation framework, COSNet. The framework uses an enhanced backbone network to obtain rich multi-scale representations through FSBs (Sec. \ref{sssec:fsb}). The intermediate network further enhances the boundary details of the third-stage features by means of a boundary enhancement module (BEM). Finally, the multi-scale feature maps $F_i$, where $i \in \{1,2,3,4,5\}$, are passed to the decoder to obtain the segmentation mask $M$.
  • Figure 2: Illustration of the backbone network of the proposed COSNet. The backbone network extracts multi-scale features at four scale levels by utilizing feature sharpening blocks (FSBs). The core component of the FSB is the multi-contextual feature extraction and sharpening (MCFS) module, which uses dilated convolutions to extract multi-contextual representations and a learnable sharpening module to enhance the boundary information.
  • Figure 3: Segmentation results of our COSNet on the ZeroWaste-f dataset [bashkirova2022_zerowaste]. Our method segments the different waste material types more accurately than DeepLabv3+ [chen2018_deeplabv3_plus] and the recently introduced FANet [Ali2024FANet], as highlighted by the yellow boxes.
  • Figure 4: Comparison of the segmentation results of the proposed COSNet on the SpectralWaste (RGB images) dataset. The proposed COSNet segments the different waste material types more accurately than FANet, as highlighted by the yellow boxes.
  • Figure 5: Demonstration of the proposed modules of COSNet on the ZeroWaste-f dataset [bashkirova2022_zerowaste]. (a) is the input image; (b)-(e) show the third-stage features for rows 1-4 in Tab. \ref{tab:ablation_on_zero_waste}, respectively. (f) is the ground truth mask; (g)-(j) show the corresponding prediction masks for rows 1-4 in Tab. \ref{tab:ablation_on_zero_waste}, respectively. These results indicate the efficacy of the proposed modules when integrated into the network.
  • ...and 2 more figures
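
The Figure 2 caption describes the MCFS module as combining dilated convolutions with a learnable sharpening step. The sketch below illustrates those two generic ideas on a 1D signal in plain Python. It is not the paper's implementation: the function names, the averaging of branches, the box-blur unsharp mask, and the scalar `alpha` (standing in for a learnable weight) are all assumptions made for illustration.

```python
def dilated_conv1d(x, kernel, dilation):
    """Same-length 1D cross-correlation with zero padding and a dilation rate."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_context(x, kernel, dilations):
    """Average the same kernel's responses at several dilation rates (assumed fusion)."""
    branches = [dilated_conv1d(x, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


def sharpen(x, alpha):
    """Unsharp-mask style sharpening: x + alpha * (x - blur(x)).

    alpha is a hypothetical stand-in for a learnable sharpening weight.
    """
    blur = dilated_conv1d(x, [1 / 3, 1 / 3, 1 / 3], 1)
    return [xi + alpha * (xi - bi) for xi, bi in zip(x, blur)]


# A step edge: the boundary lies between index 2 and index 3.
signal = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
context = multi_context(signal, [0.25, 0.5, 0.25], dilations=[1, 2])
sharpened = sharpen(signal, alpha=1.0)
```

In this toy example the sharpened signal undershoots just before the step and overshoots just after it, which is the sense in which a sharpening operation accentuates boundaries; larger dilation rates let the same three-tap kernel cover a wider context.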