Automatic Vehicle Detection using DETR: A Transformer-Based Approach for Navigating Treacherous Roads
Istiaq Ahmed Fahad, Abdullah Ibne Hanif Arean, Nazmus Sakib Ahmed, Mahmudul Hasan
TL;DR
This work addresses automatic vehicle detection (AVD) in diverse driving environments by applying a Transformer-based DETR approach, trained with the Collaborative Hybrid Assignments Training scheme (Co-DETR), to the BadODD dataset from Bangladesh. It compares Co-DETR against YOLOv8m, showing that the transformer-based method yields higher detection accuracy (peak mAP of 0.438 at 9 epochs) and better robustness on treacherous roads. The study details dataset characteristics, preprocessing, and model configurations, and demonstrates the practical potential of DETR for autonomous navigation in complex real-world settings. The findings suggest that transformer-based detection with Co-DETR is a viable path toward reliable AVD, with future work targeting real-time deployment and hybrid architectures.
Abstract
Automatic Vehicle Detection (AVD) in diverse driving environments presents unique challenges due to varying lighting conditions, road types, and vehicle types. Traditional methods, such as YOLO and Faster R-CNN, often struggle to cope with these complexities. As computer vision evolves, combining Convolutional Neural Networks (CNNs) with Transformer-based approaches offers promising opportunities for improving detection accuracy and efficiency. This study is the first to experiment with Detection Transformer (DETR) for automatic vehicle detection in complex and varied settings. We employ a Collaborative Hybrid Assignments Training scheme, Co-DETR, to enhance feature learning and attention mechanisms in DETR. By leveraging versatile label assignment strategies and introducing multiple parallel auxiliary heads, we provide more effective supervision during training and extract positive coordinates from these heads to boost training efficiency. Through extensive experiments on DETR variants and YOLO models, conducted on the BadODD dataset, we demonstrate the advantages of our approach. Our method achieves superior results and improved accuracy across diverse conditions, making it practical for real-world deployment. This work advances autonomous navigation technology and opens new research avenues in object detection for autonomous vehicles. By integrating the strengths of CNNs and Transformers, we highlight the potential of DETR for robust and efficient vehicle detection in challenging driving environments.
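The contrast between label assignment strategies mentioned above can be illustrated with a minimal sketch. It is not the Co-DETR implementation: it assumes boxes in (x1, y1, x2, y2) format, uses plain IoU as the only matching criterion (DETR-style matching also uses classification and box-regression costs), and replaces Hungarian matching with a brute-force search that is only suitable for toy inputs. The function names `iou`, `one_to_one`, and `one_to_many` and the 0.5 threshold are hypothetical choices made for illustration.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def one_to_one(preds, gts):
    """One-to-one assignment: each ground-truth box gets exactly one prediction.

    Brute-force stand-in for Hungarian matching, maximizing total IoU.
    Assumes len(preds) >= len(gts). Returns {gt_index: pred_index}.
    """
    best, best_score = (), -1.0
    for perm in permutations(range(len(preds)), len(gts)):
        score = sum(iou(preds[p], gts[g]) for g, p in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return {g: p for g, p in enumerate(best)}


def one_to_many(preds, gts, thr=0.5):
    """One-to-many assignment: each ground-truth box gets every prediction
    whose IoU with it is at least thr. Returns {gt_index: [pred_index, ...]}.
    """
    return {
        g: [p for p, pb in enumerate(preds) if iou(pb, gt) >= thr]
        for g, gt in enumerate(gts)
    }


# Toy example: one vehicle and three candidate boxes.
gts = [(0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]

print(one_to_one(preds, gts))   # {0: 0}
print(one_to_many(preds, gts))  # {0: [0, 1]}
```

In this toy case the one-to-one rule supervises a single positive prediction per vehicle, while the one-to-many rule also treats the second, slightly offset box as positive. Denser positive supervision of this kind is the sort of extra training signal that auxiliary heads are meant to provide.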
