Real-time High-Resolution Neural Network with Semantic Guidance for Crack Segmentation

Yongshang Li, Ronggui Ma, Han Liu, Gaoli Cheng

TL;DR

HrSegNet introduces a real-time, high-resolution crack segmentation architecture that preserves fine crack details through a controllable high-resolution path guided by a lightweight semantic stream. The model employs semantic guidance, a two-step segmentation head, and deep supervision to achieve high accuracy with low latency, scalable across multiple channel capacities. Extensive ablations demonstrate the benefits of the semantic guidance, fusion strategy, and supervision scheme, while experiments on OCD/RCD and CrackSeg9k show competitive mIoU and fast inference (e.g., 182 FPS for the smallest variant and 80.32% mIoU for a larger variant). The approach enables robust crack inspection on edge devices, combining detailed local information with contextual understanding for practical real-world deployment.
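
For readers unfamiliar with the mIoU figure quoted above, the following is a minimal sketch of how mean intersection-over-union is commonly computed for a two-class (background/crack) mask. The function name `mean_iou`, the nested-list mask format, and the choice to skip classes absent from both masks are assumptions made for illustration. The paper's exact evaluation protocol is not stated in this summary.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int = 2) -> float:
    """Mean IoU over the classes that appear in the prediction or the target.

    Assumption: masks are equally sized 2D grids of integer class labels
    (0 = background, 1 = crack). This is an illustrative helper, not the
    authors' evaluation code.
    """
    inter = [0] * num_classes  # per-class count of pixels where both masks agree
    union = [0] * num_classes  # per-class count of pixels where either mask has the class
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            for c in range(num_classes):
                inter[c] += int(p == c and t == c)
                union[c] += int(p == c or t == c)
    # Skip classes that occur in neither mask to avoid dividing by zero.
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    if not ious:
        raise ValueError("masks contain no pixels with labels in range")
    return sum(ious) / len(ious)


# Toy 2x3 masks: one background pixel is wrongly predicted as crack.
pred = [[0, 1, 1],
        [0, 0, 1]]
target = [[0, 1, 0],
          [0, 0, 1]]
score = mean_iou(pred, target)
```

In this toy case the crack class has IoU 2/3 and the background class has IoU 3/4, so `score` is their mean, 17/24.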

Abstract

Deep learning plays an important role in crack segmentation, but most work utilizes off-the-shelf or improved models that have not been specifically developed for this task. High-resolution convolutional neural networks that are sensitive to objects' location and detail help improve the performance of crack segmentation, yet conflict with real-time detection. This paper describes HrSegNet, a high-resolution network with semantic guidance specifically designed for crack segmentation, which guarantees real-time inference speed while preserving crack details. After evaluation on the composite dataset CrackSeg9k and the scenario-specific datasets Asphalt3k and Concrete3k, HrSegNet obtains state-of-the-art segmentation performance and efficiencies that far exceed those of the compared models. This approach demonstrates that there is a trade-off between high-resolution modeling and real-time detection, which fosters the use of edge devices to analyze cracks in real-world applications.

Paper Structure (24 sections, 4 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: Automated inspection apparatuses and their data: (a) unmanned aerial vehicle; (b) road measurement vehicle.
  • Figure 2: The main body of the proposed HrSegNet.
  • Figure 3: Two examples of the HrSeg block. (a) illustrates the semantic-guided component within the HrSeg block, which maintains the same resolution as the high-resolution path but gradually decreases by a factor of 2 in subsequent blocks. (b) demonstrates another way to provide semantic guidance by gradually decreasing the spatial resolution of the semantic guidance within a HrSeg block.
  • Figure 4: (a) is the single-step segmentation head. (b) is the two-step segmentation head.
  • Figure 5: Deep supervision utilized in HrSegNet. Head 1 is a single-step segmentation head, whereas Head 2 is a two-step segmentation head.
  • ...and 4 more figures