Deep Learning for Segmentation of Cracks in High-Resolution Images of Steel Bridges

Andrii Kompanets, Gautam Pai, Remco Duits, Davide Leonetti, Bert Snijder

TL;DR

This work tackles automatic segmentation of fatigue cracks in high-resolution steel-bridge images captured by UAVs. It introduces a ConvNext-based encoder–decoder architecture with a specialized loss-inversion scheme and deep supervision to effectively train on patch-based, high-resolution data and reduce false positives. A new public dataset (CSB) of steel-bridge cracks, plus CSB patch variants, enables thorough evaluation of training strategies, background-patch effects, and annotation impact. Results show that background-patch-aware training and loss-inversion significantly improve performance, and patch size and global context substantially influence crack discrimination, with practical implications for UAV-aided bridge inspection workflows.

Abstract

Automating the current bridge visual inspection practices using drones and image processing techniques is a prominent way to make these inspections more effective, robust, and less expensive. In this paper, we investigate the development of a novel deep-learning method for the detection of fatigue cracks in high-resolution images of steel bridges. First, we present a novel and challenging dataset comprising images of cracks in steel bridges. Second, we integrate the ConvNext neural network with a previous state-of-the-art encoder-decoder network for crack segmentation. We study and report the effects of using background patches on network performance when applied to high-resolution images of cracks in steel bridges. Finally, we introduce a loss function that allows more background patches to be used during training, which yields a significant reduction in false positive rates.

Paper Structure

This paper contains 33 sections, 14 equations, 9 figures, and 10 tables.

Figures (9)

  • Figure 1: Scheme of the proposed encoder-decoder network
  • Figure 2: Examples of images and ground truth crack segmentation from the CFD dataset
  • Figure 3: Examples of images and ground truth crack segmentation from the CSB dataset
  • Figure 4: Illustration of the structure of the datasets
  • Figure 5: Example of an image of size 4608x3456 pixels split into patches of size 512x512. Red squares represent background patches and green squares represent crack patches. Note: the bottom row of patches overlaps with the row above it
  • ...and 4 more figures
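
The patch layout described in the Figure 5 caption can be illustrated with a short sketch. It assumes that when the image side is not a multiple of the patch size, the last row (or column) is shifted back so that it ends exactly at the image border, which is one way to produce the overlap the caption mentions. The function names `patch_origins` and `patch_grid` are hypothetical and are not taken from the paper.

```python
def patch_origins(length, patch):
    """Start offsets of patches along one image axis.

    Assumption: patches are laid out without gaps, and if the final
    patch would cross the border it is shifted back to end at the
    border, so it overlaps the previous patch.
    """
    if length < patch:
        raise ValueError("image side is smaller than the patch size")
    # Offsets of all patches that fit without crossing the border.
    starts = list(range(0, length - patch + 1, patch))
    # If pixels remain uncovered, add one more patch flush with the border.
    if starts[-1] + patch < length:
        starts.append(length - patch)
    return starts


def patch_grid(width, height, patch=512):
    """Top-left (x, y) corners of all patches, row by row."""
    return [(x, y)
            for y in patch_origins(height, patch)
            for x in patch_origins(width, patch)]


# Image size and patch size from the Figure 5 caption.
grid = patch_grid(4608, 3456, patch=512)
rows = patch_origins(3456, 512)
cols = patch_origins(4608, 512)
print(len(cols), "columns x", len(rows), "rows =", len(grid), "patches")
print("last two row offsets:", rows[-2:])
```

Under this assumption the width divides evenly into 9 columns, while the height needs 7 rows, with the bottom row starting at y = 2944 and overlapping the row above it by 128 pixels. Whether the authors use exactly this shifting rule is not stated in the caption.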