DAPONet: A Dual Attention and Partially Overparameterized Network for Real-Time Road Damage Detection

Weichao Pan; Jiaju Kang; Xu Wang; Zhihao Chen; Yiyuan Ge

TL;DR

DAPONet is a model for real-time road damage detection on street view image data (SVRDD) that combines three key modules: a dual attention mechanism combining global and local attention, a multi-scale partial over-parameterization module, and an efficient downsampling module.

Abstract

Current road damage detection methods, which rely on manual inspections or sensor-mounted vehicles, are inefficient, limited in coverage, and often inaccurate, especially for minor damage, leading to delays and safety hazards. To address these issues and improve real-time road damage detection on street view image data (SVRDD), we propose DAPONet, a model incorporating three key modules: a dual attention mechanism combining global and local attention, a multi-scale partial over-parameterization module, and an efficient downsampling module. DAPONet achieves a mAP50 of 70.1% on the SVRDD dataset, outperforming YOLOv10n by 10.4%, while reducing parameters to 1.6M and FLOPs to 1.7G, reductions of 41% and 80%, respectively. On the MS COCO2017 val dataset, DAPONet achieves a mAP50-95 of 33.4%, 0.8% higher than EfficientDet-D1, with a 74% reduction in both parameters and FLOPs.

Paper Structure (16 sections, 2 figures, 3 tables)

This paper contains 16 sections, 2 figures, 3 tables.

Introduction
Methods
Overview
Global Localization Context Attention (GLCA) Module
Cross Stage Partial Depthwise Over-parameterized Attention (CPDA) Module
Mix Convolutional Downsampling (MCD) Module
Experimental details
Datasets
Experimental environment
Evaluation metrics
Experimental results and discussion and analysis
Comparative experiments
Generalized object detection experiments
Ablation study
Error analysis
...and 1 more sections

Figures (2)

Figure 1: Overall framework of DAPONet. In the GLCA module, GN is Group Normalization, CLR is Conv-Layer Normalization-ReLU, and CB is Conv-Batch Normalization. In the PB block of the CPDA module, a quarter of the input feature map's channels is routed through DOConv, and the remaining channels are concatenated directly with the DOConv output.
Figure 2: Visual detection results of the compared models on the SVRDD dataset. The models differ in how well they detect road damage. YOLOv5n struggles with minor cracks, resulting in higher miss rates. YOLOv8n improves accuracy for transverse cracks and manhole covers but still misses subtle damage. YOLOv9t prioritizes speed but loses precision on finer details. YOLOv10n is effective for larger cracks but produces more false negatives for smaller damage. DAPONet outperforms the other models, accurately detecting a broad range of damage, including fine cracks and manhole covers, with high confidence and fewer errors, demonstrating its robustness and precision across road damage scenarios.
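
The channel routing described for the PB block in Figure 1 can be sketched in a few lines of dependency-free Python. This is only an illustration of the split-and-concatenate idea, not the authors' implementation. The names `partial_block` and `double_values` are hypothetical. The sketch assumes the routed quarter is the first quarter of the channels, and it takes a caller-supplied function as a stand-in for DOConv, which is not implemented here.

```python
from typing import Callable, List

Channel = List[List[float]]  # one H x W feature plane


def partial_block(
    channels: List[Channel],
    transform: Callable[[List[Channel]], List[Channel]],
    ratio: float = 0.25,
) -> List[Channel]:
    """Route a fraction of the channels through `transform`, pass the rest through.

    Assumption: the routed channels are the first `ratio` of the list.
    """
    n_active = max(1, int(len(channels) * ratio))
    active, passthrough = channels[:n_active], channels[n_active:]
    # Concatenate the transformed channels with the untouched ones.
    return transform(active) + passthrough


def double_values(chs: List[Channel]) -> List[Channel]:
    # Hypothetical stand-in for DOConv: any channel-preserving operation works here.
    return [[[2.0 * v for v in row] for row in ch] for ch in chs]


# Toy feature map: 8 channels of size 2 x 2, channel c filled with the value c + 1.
feature_map = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(8)]
out = partial_block(feature_map, double_values)
```

With 8 input channels and a ratio of 0.25, only the first 2 channels pass through the stand-in operation, so the cost of that operation scales with a quarter of the channel count while the output keeps all 8 channels.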
