Real-time High-Resolution Neural Network with Semantic Guidance for Crack Segmentation
Yongshang Li, Ronggui Ma, Han Liu, Gaoli Cheng
TL;DR
HrSegNet introduces a real-time, high-resolution crack segmentation architecture that preserves fine crack details through a controllable high-resolution path guided by a lightweight semantic stream. The model employs semantic guidance, a two-step segmentation head, and deep supervision to achieve high accuracy with low latency, scalable across multiple channel capacities. Extensive ablations demonstrate the benefits of the semantic guidance, fusion strategy, and supervision scheme, while experiments on OCD/RCD and CrackSeg9k show competitive mIoU and fast inference (e.g., 182 FPS for the smallest variant and 80.32% mIoU for a larger variant). The approach enables robust crack inspection on edge devices, combining detailed local information with contextual understanding for practical real-world deployment.
Abstract
Deep learning plays an important role in crack segmentation, but most work utilizes off-the-shelf or improved models that have not been specifically developed for this task. High-resolution convolutional neural networks, which are sensitive to object location and detail, help improve crack segmentation performance, yet they conflict with real-time detection. This paper describes HrSegNet, a high-resolution network with semantic guidance specifically designed for crack segmentation, which guarantees real-time inference speed while preserving crack details. After evaluation on the composite dataset CrackSeg9k and the scenario-specific datasets Asphalt3k and Concrete3k, HrSegNet obtains state-of-the-art segmentation performance and efficiency that far exceeds that of the compared models. This approach demonstrates that a trade-off can be struck between high-resolution modeling and real-time detection, which fosters the use of edge devices to analyze cracks in real-world applications.
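The accuracy and speed figures quoted above are reported as mIoU and FPS. As a reading aid, the following is a minimal sketch of how these two quantities are commonly computed. It is not the authors' evaluation code: the function names `mean_iou` and `fps_to_latency_ms` are hypothetical, the labels are assumed to be flat sequences with 0 for background and 1 for crack, and skipping classes absent from both prediction and ground truth is an assumption of this sketch.

```python
def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union across classes for flat label sequences.

    Assumption of this sketch: a class that appears in neither `pred`
    nor `target` is skipped rather than counted as IoU 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


def fps_to_latency_ms(fps):
    """Convert a throughput in frames per second to per-frame latency in ms."""
    return 1000.0 / fps


# Toy example: six pixels, 0 = background, 1 = crack (illustrative data only).
pred = [0, 0, 1, 1, 0, 1]
target = [0, 0, 1, 0, 0, 1]
score = mean_iou(pred, target)

# Per-frame budget implied by the 182 FPS figure for the smallest variant.
latency = fps_to_latency_ms(182)
```

In this toy case the background IoU is 3/4 and the crack IoU is 2/3, so the mean is 17/24. A throughput of 182 FPS corresponds to roughly 5.5 ms per frame, which gives a sense of the latency budget involved in real-time inference.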
