Fast and Accurate Object Detection on Asymmetrical Receptive Field

Tianhao Lin

TL;DR

This work tackles the accuracy-speed trade-off in object detection by rethinking the final receptive-field geometry in YOLOv5. It introduces a nine-map head produced via asymmetrical pooling, along with new anchors and a multi-pass NMS strategy to better handle objects of varying shapes. Empirical results on COCO show modest $mAP$ gains with limited impact on speed, and experiments demonstrate benefits for both square and rectangular objects. The approach offers practical improvements for real-time detection scenarios and points to future refinements in backbone/neck design and autonomous driving applications.

Abstract

Object detection is used across a wide range of industries. In autonomous driving, the task is to accurately and efficiently identify and locate instances of many predefined object classes (vehicles, pedestrians, traffic signs, etc.) in road video. In robotics, industrial robots need to recognize specific machine elements. In security, cameras must accurately recognize each person's face. The wide adoption of deep learning has greatly improved the accuracy and efficiency of object detection, but deep-learning-based detection still faces challenges. Different applications impose different requirements, including high accuracy, multi-category detection, real-time detection, and robustness to occlusion. To address these challenges, and drawing on an extensive literature review, this paper analyzes methods for improving and optimizing mainstream object detection algorithms from the perspective of the evolution of one-stage and two-stage detectors. It further proposes a method for improving detection accuracy by changing the receptive field. The new model is based on the original YOLOv5 (You Only Look Once) with some modifications: the head of YOLOv5 is modified by adding asymmetrical pooling layers. As a result, accuracy improves while speed is preserved. The performance of the new model is compared with that of the original YOLOv5 model and analyzed with respect to several parameters, and the new model is evaluated in four situations. Finally, the paper summarizes the problems that remain to be solved and outlines future research directions.
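The effect of asymmetrical pooling on a feature map can be illustrated with a minimal sketch. It rests on several assumptions not stated in this summary: that the pooling is max pooling, that the stride equals the window size, and that a (1, 2) window means height 1 and width 2. The helper name `max_pool2d` is hypothetical, and the feature map is a plain nested list rather than a tensor.

```python
def max_pool2d(fmap, kernel):
    """Max-pool a 2D nested list with a (kh, kw) window.

    Assumption: stride equals the window size and no padding is applied.
    """
    kh, kw = kernel
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r + dr][c + dc] for dr in range(kh) for dc in range(kw))
            for c in range(0, cols - kw + 1, kw)
        ]
        for r in range(0, rows - kh + 1, kh)
    ]


# A toy 4 x 4 feature map standing in for one convolutional output.
fmap = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

wide = max_pool2d(fmap, (1, 2))  # halves the width only: 4 rows x 2 columns
tall = max_pool2d(fmap, (2, 1))  # halves the height only: 2 rows x 4 columns

print(wide)
print(tall)
```

Under these assumptions, each cell of `wide` summarizes a region twice as wide as it is tall, and each cell of `tall` summarizes a region twice as tall as it is wide, which is one way a head could obtain non-square receptive fields for rectangular objects.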

Paper Structure (20 sections, 3 equations, 7 figures, 7 tables)

Introduction
Related Work
Development of YOLO
COCO Dataset
Metrics
Proposed Methodology
Architecture of YOLOv5
New Head
New Head
New Anchors
New Strategy of NMS
Experiment
Configuration
Training Hyperparameter
Evaluations
...and 5 more sections

Figures (7)

Figure 1: Architecture of YOLOv5. The whole network is composed of Backbone, Neck and Head.
Figure 2: Details of each component in the YOLOv5 backbone.
Figure 3: The new head detector outputs 9 feature maps. The input passes through three types of processing: Conv, Conv + (1,2) pooling and Conv + (2,1) pooling.
Figure 4: PR-curves of the original model and the square-anchor model. The mAP@0.5 of the original model is 0.202 and the mAP@0.5 of the square-anchor model is 0.206. mAP@0.5 denotes mAP at an IoU threshold of 0.5.
Figure 5: PR-curves of the original model and the (1, 2) pooling model. The mAP@0.5 of the original model is 0.204 and the mAP@0.5 of the (1, 2) pooling model is 0.224.
...and 2 more figures
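
The IoU threshold behind the mAP@0.5 figures quoted in the captions can be made concrete with a short sketch. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates and that a prediction counts as a match when its IoU with a ground-truth box is at least the threshold. The function names `iou` and `is_match` are hypothetical, and the full mAP computation (ranking by confidence, per-class averaging) is not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """Assumption: a prediction matches when IoU >= threshold."""
    return iou(pred, truth) >= threshold


square = (0, 0, 2, 2)
shifted = (1, 0, 3, 2)  # same size, shifted right by half its width
flat = (0, 0, 2, 1)     # lower half of the square

print(iou(square, shifted), is_match(square, shifted))
print(iou(square, flat), is_match(square, flat))
```

In this toy case the half-shifted box overlaps the square by only one third and is rejected at a 0.5 threshold, while the flat box covering half of the square sits exactly at the threshold and is accepted.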
