Multi-class Road Defect Detection and Segmentation using Spatial and Channel-wise Attention for Autonomous Road Repairing
Jongmin Yu, Chen Bene Chi, Sebastiano Fichera, Paolo Paoletti, Devansh Mehta, Shan Luo
TL;DR
The paper addresses the challenge of simultaneously detecting and segmenting multiple road defect classes in a unified end-to-end framework. It introduces SCM-MRCNN, a Mask-RCNN-based architecture augmented with Spatial and Channel-wise Multi-head Attention (SCM-attention) blocks that learn robust spatio-channel representations for multi-class defect detection and segmentation. A new RoadEYE dataset with nine defect classes provides a benchmark for both bounding-box detection and pixel-level segmentation. Extensive experiments show state-of-the-art performance on RDD2020, CS datasets, and RoadEYE, with gains reported in metrics such as $mAP$, $AP_M$, $AP_B$, and $AIU$. The work demonstrates that modelling long-range dependencies in both spatial and channel dimensions enhances defect understanding, with practical impact for autonomous road repair systems that require precise localization and segmentation to optimize repair material usage.
Abstract
Road pavement defect detection and segmentation are critical for developing autonomous road repair systems. However, developing an instance segmentation method that simultaneously performs multi-class defect detection and segmentation is challenging due to the textural simplicity of road pavement images, the diversity of defect geometries, and the morphological ambiguity between classes. We propose a novel end-to-end method for multi-class road defect detection and segmentation. The proposed method comprises multiple spatial and channel-wise attention blocks that learn global representations across the spatial and channel dimensions. Through these attention blocks, more globally generalised representations of the morphological information (spatial characteristics) of road defects and of the colour and depth information of images can be learned. To demonstrate the effectiveness of our framework, we conducted various ablation studies and comparisons with prior methods on a newly collected dataset annotated with nine road defect classes. The experiments show that our proposed method outperforms existing state-of-the-art methods for multi-class road defect detection and segmentation.
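To make the idea of attention across both spatial and channel dimensions concrete, the following is a minimal pure-Python sketch, not the SCM-attention block itself. It assumes a feature map stored as C channel rows of N flattened spatial positions, uses identity query/key/value projections instead of learned ones, and combines the two branches with a residual sum; all of these choices, and every function name, are illustrative assumptions rather than details taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V (assumption)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def multi_head_attention(tokens, num_heads):
    """Split each token's features into heads, attend per head, concatenate."""
    d = len(tokens[0])
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    step = d // num_heads
    heads = [
        self_attention([t[h * step:(h + 1) * step] for t in tokens])
        for h in range(num_heads)
    ]
    return [[v for head in heads for v in head[i]] for i in range(len(tokens))]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def spatial_channel_attention(feature_map, num_heads=2):
    """Hypothetical block: spatial branch + channel branch + residual input.

    feature_map: C rows (channels), each holding N flattened spatial values.
    """
    # Spatial branch: each of the N positions is a token with C features.
    spatial = transpose(multi_head_attention(transpose(feature_map), num_heads))
    # Channel branch: each of the C channels is a token with N features.
    channel = multi_head_attention(feature_map, num_heads)
    # Residual sum of the input and both branches (an assumed fusion rule).
    return [
        [x + s + c for x, s, c in zip(x_row, s_row, c_row)]
        for x_row, s_row, c_row in zip(feature_map, spatial, channel)
    ]


if __name__ == "__main__":
    fmap = [[0.1, 0.9, 0.3, 0.5], [0.7, 0.2, 0.8, 0.4]]  # C=2, N=4
    for row in spatial_channel_attention(fmap):
        print([round(v, 3) for v in row])
```

In the spatial branch every position attends to every other position, and in the channel branch every channel attends to every other channel, which is one simple way to capture long-range dependencies along both axes. The output keeps the C x N shape of the input, so a block of this kind could be stacked or dropped into a backbone without changing tensor sizes.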
