Multi-temporal crack segmentation in concrete structures using deep learning approaches

Said Harb, Pedro Achanccaray, Mehdi Maboudi, Markus Gerke

TL;DR

This study probes whether multi-temporal data can improve semantic crack segmentation in concrete structures for structural health monitoring (SHM). It compares a Swin UNETR model trained on a multi-temporal crack-propagation dataset against a mono-temporal U-Net baseline, using 32-frame sequences and a deserialized mono-temporal reference. Results show that the multi-temporal approach yields higher segmentation accuracy and temporal consistency, with IoU and F1-score improvements (IoU around $82.72\%$ and F1 around $90.54\%$) while using roughly half the trainable parameters. The findings highlight the value of temporal context for crack detection and monitoring, offering a practical path toward more reliable long-term SHM of concrete infrastructure.

Abstract

Cracks are among the earliest indicators of deterioration in concrete structures. Early automatic detection of these cracks can significantly extend the lifespan of critical infrastructure, such as bridges, buildings, and tunnels, while reducing maintenance costs and facilitating efficient structural health monitoring. This study investigates whether leveraging multi-temporal data for crack segmentation can enhance segmentation quality. To assess the effect of temporal information relative to conventional single-epoch approaches, we compare a Swin UNETR trained on multi-temporal data with a U-Net trained on mono-temporal data. To this end, a multi-temporal dataset comprising 1356 images, each with 32 sequential crack propagation images, was created. After training the models, experiments were conducted to analyze their generalization ability, temporal consistency, and segmentation quality. The multi-temporal approach consistently outperformed its mono-temporal counterpart, achieving an IoU of $82.72\%$ and an F1-score of $90.54\%$, a significant improvement over the mono-temporal model's IoU of $76.69\%$ and F1-score of $86.18\%$, despite requiring only half of the trainable parameters. The multi-temporal model also displayed more consistent segmentation quality, with reduced noise and fewer errors. These results suggest that temporal information significantly enhances the performance of segmentation models, offering a promising solution for improved crack detection and the long-term monitoring of concrete structures, even with limited sequential data.
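
Since the comparison rests on IoU and F1-score, a minimal sketch of how these two metrics are commonly computed for a binary crack mask may help. It is not the authors' evaluation code. The function names, the nested-list mask format, the toy masks, and the convention of returning perfect scores when both masks are empty are all assumptions made for illustration.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives and false negatives
    over two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # crack pixel predicted as crack
            elif p and not t:
                fp += 1          # background predicted as crack
            elif t and not p:
                fn += 1          # crack pixel missed
    return tp, fp, fn


def iou_and_f1(pred, target):
    """Return (IoU, F1) for the crack class of a single mask pair."""
    tp, fp, fn = confusion_counts(pred, target)
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: both masks empty counts as a perfect match.
        return 1.0, 1.0
    iou = tp / union                      # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return iou, f1


# Hypothetical 3x4 toy masks (1 = crack, 0 = background).
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 1, 0, 0]]
target = [[0, 1, 0, 0],
          [0, 1, 0, 0],
          [0, 1, 1, 0]]

iou, f1 = iou_and_f1(pred, target)
print(f"IoU = {iou:.2f}, F1 = {f1:.2f}")
```

For a multi-temporal sequence, the same function could be applied to each of the 32 frames and the results averaged. Whether the reported figures are averaged per frame, per sequence, or over all pixels is not stated in this summary.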

Paper Structure

This paper contains 12 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Overview of methods to analyze cracks.
  • Figure 2: Concrete block in the last epoch. This is the last stage of crack propagation on the concrete block. The preceding 24 images show the incremental crack propagation through time.
  • Figure 3: Swin UNETR architecture. A 3D Swin Transformer is used as a feature extractor, and in the decoder, the feature maps are concatenated and upsampled to the original input size. Adapted from Hatamizadeh et al. (2022).
  • Figure 4: Comparison of U-Net and Swin UNETR sequential predictions. The Swin UNETR produces smoother and more consistent predictions containing less noise than the U-Net.
  • Figure 5: Early-stage vs. late-stage crack segmentation. Both models perform well on early-stage cracks except for thin and low-contrast cracks, which are also rare in the targets. Only the Swin UNETR manages to maintain adequate segmentation quality in later stages of crack propagation.
  • ...and 1 more figure