Multi-temporal crack segmentation in concrete structures using deep learning approaches
Said Harb, Pedro Achanccaray, Mehdi Maboudi, Markus Gerke
TL;DR
This study probes whether multi-temporal data can improve semantic crack segmentation in concrete structures for structural health monitoring (SHM). It compares a Swin UNETR model trained on a multi-temporal crack-propagation dataset against a mono-temporal U-Net baseline, using 32-frame sequences and a deserialized mono-temporal reference. Results show that the multi-temporal approach yields higher segmentation accuracy and temporal consistency, reaching an IoU of $82.72\%$ and an F1-score of $90.54\%$ while using roughly half the trainable parameters. The findings highlight the value of temporal context for crack detection and monitoring, offering a practical path toward more reliable long-term SHM of concrete infrastructure.
Abstract
Cracks are among the earliest indicators of deterioration in concrete structures. Early automatic detection of these cracks can significantly extend the lifespan of critical infrastructure, such as bridges, buildings, and tunnels, while reducing maintenance costs and facilitating efficient structural health monitoring. This study investigates whether leveraging multi-temporal data can enhance crack segmentation quality. We compare a Swin UNETR trained on multi-temporal data with a U-Net trained on mono-temporal data to assess the effect of temporal information relative to conventional single-epoch approaches. To this end, a multi-temporal dataset comprising 1356 samples, each consisting of 32 sequential crack propagation images, was created. After training the models, experiments were conducted to analyze their generalization ability, temporal consistency, and segmentation quality. The multi-temporal approach consistently outperformed its mono-temporal counterpart, achieving an IoU of $82.72\%$ and an F1-score of $90.54\%$, a significant improvement over the mono-temporal model's IoU of $76.69\%$ and F1-score of $86.18\%$, despite requiring only half of the trainable parameters. The multi-temporal model also produced more consistent segmentations, with reduced noise and fewer errors. These results suggest that temporal information significantly enhances the performance of segmentation models, offering a promising solution for improved crack detection and the long-term monitoring of concrete structures, even with limited sequential data.
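The two metrics reported above, IoU and F1-score, can be illustrated with a minimal sketch for a single pair of binary crack masks. This is not the authors' evaluation code: the function name `iou_and_f1`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration only.

```python
def iou_and_f1(pred, target):
    """Return (IoU, F1) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; 1 marks a crack pixel.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # crack pixel predicted and present
            elif p and not t:
                fp += 1  # crack pixel predicted but absent
            elif t and not p:
                fn += 1  # crack pixel present but missed

    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    iou = tp / union                   # |P ∩ T| / |P ∪ T|
    f1 = 2 * tp / (2 * tp + fp + fn)   # harmonic mean of precision and recall
    return iou, f1


# Toy 3x4 masks: three pixels agree, one false positive, one false negative.
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 1, 0, 0]]
target = [[0, 1, 0, 0],
          [0, 1, 0, 0],
          [0, 1, 1, 0]]

iou, f1 = iou_and_f1(pred, target)
print(f"IoU = {iou:.2f}, F1 = {f1:.2f}")  # IoU = 0.60, F1 = 0.75
```

For a multi-temporal sequence, the same function could be applied to each of the 32 frames separately; how the per-frame or per-image scores are aggregated into the reported values is not stated here, so the sketch leaves that step out.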
