Enhancing Shape Perception and Segmentation Consistency for Industrial Image Inspection

Guoxuan Mao, Ting Cao, Ziyang Li, Yuan Dong

TL;DR

This work tackles segmentation consistency for fixed components in industrial image inspection under real-time constraints. It introduces SPENet, a shape-aware two-path network employing a Spatial Path for shape extraction, a decoupled body–edge module, and a Variable Boundary Domain to supervise boundary information. A novel Consistency Mean Square Error (CMSE) metric is proposed to quantify segmentation consistency for fixed parts. On the PVC dataset, SPENet achieves state-of-the-art mIoU and CMSE performance at real-time speed, and shows competitive results on Cityscapes, highlighting practical impact for industrial QA and potential transfer to other industrial vision tasks.

Abstract

Semantic segmentation is a pivotal research focus in computer vision. In industrial image inspection, conventional semantic segmentation models fail to keep the segmentation of fixed components consistent across varying contextual environments because they lack perception of object contours. Given the real-time constraints and limited computing capability of industrial image detection machines, efficient models with low computational complexity are also necessary. In this work, a Shape-Aware Efficient Network (SPENet) is proposed, which focuses on the shapes of objects and achieves excellent segmentation consistency by separately supervising the extraction of boundary and body information from images. SPENet introduces the Variable Boundary Domain (VBD), a novel method for describing fuzzy boundaries that adapts better to real-world scenarios. Additionally, a new metric, Consistency Mean Square Error (CMSE), is proposed to measure segmentation consistency for fixed components. Our approach attains the best segmentation accuracy and competitive speed on our dataset, and shows a significant advantage in CMSE over numerous state-of-the-art real-time segmentation networks, with a reduction of over 50% compared to the previously top-performing models.
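Segmentation accuracy in this setting is reported as mIoU. As a reference point, the following is a minimal sketch of the standard mean Intersection-over-Union computation from a confusion matrix. It is not the authors' evaluation code; the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Standard mIoU over integer label maps (hypothetical helper)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumption: classes that appear in neither map are excluded from the mean.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())

# Toy 2x2 label maps with two classes.
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.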

Paper Structure

This paper contains 17 sections, 5 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Overall structure of SPENet; the details of "SSP" are shown in Fig. 2. The resolution of each critical intermediate feature is annotated relative to the input image. The "Decoupled Module" is the method proposed in li2020improving for separating body and edge information.
  • Figure 2: The architecture of "SSP" in Fig. 1. Channel-wise Attention is the Squeeze-and-Excitation process in hu2018squeeze. "a" and "b" correspond to the "a" and "b" in Fig. 1. The "Start" block uses a stride-2 convolution combined with max pooling, as in paszke2016enet.
  • Figure 3: Details of ASPP and ACP in Fig. 2. All dilated and asymmetric convolutions are depth-wise separable convolutions.
  • Figure 4: Example of PVC dataset
  • Figure 5: Visualized segmentation results on PVC dataset
  • ...and 1 more figure
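
The Figure 2 caption identifies the Channel-wise Attention in "SSP" as the Squeeze-and-Excitation process. A minimal NumPy sketch of that generic process follows: global average pooling, a two-layer bottleneck with ReLU and sigmoid, then per-channel rescaling. The weights are random placeholders, biases are omitted, and the reduction ratio `r = 4` and all names are assumptions for illustration, not values taken from SPENet.

```python
import numpy as np

def se_channel_attention(x, w1, w2):
    """Generic Squeeze-and-Excitation on a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); biases are omitted.
    """
    squeezed = x.mean(axis=(1, 2))                # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # excitation, step 1: reduce + ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # excitation, step 2: expand + sigmoid
    return x * scale[:, None, None], scale        # rescale each channel

rng = np.random.default_rng(0)
C, r = 8, 4                                       # hypothetical channel count and ratio
x = rng.standard_normal((C, 5, 5))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y, scale = se_channel_attention(x, w1, w2)
```

Each channel is multiplied by a single gate in (0, 1), so the output keeps the input's shape while channels are re-weighted by their global statistics.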