YOLOCS: Object Detection based on Dense Channel Compression for Feature Spatial Solidification
Lin Huang, Weisheng Li, Yujuan Tan, Linlin Shen, Jing Yu, Haojie Fu
TL;DR
The paper addresses the trade-off between accuracy and real-time speed in single-stage object detectors. It introduces two modules, the Dense Channel Compression for Feature Spatial Solidification structure (DCFS) and the Asymmetric Multi-Level Channel Compression Decoupled Head (ADH), and integrates them into a YOLOv5 baseline to form YOLOCS. DCFS improves feature purification and gradient flow in the backbone/neck by progressively compressing channels while expanding the receptive field. ADH restructures the head with an asymmetric, multi-path design that emphasizes the objectness branch and separates classification from regression. On MS-COCO, the two modules together raise AP over YOLOv5 by 1.1, 2.3, and 5.2 points for the large, medium, and small models, respectively, at inference speeds close to the baseline, and the reported results are state of the art relative to several competing detectors. The work advances real-time object detection by strengthening forward and backward information propagation and by reducing the feature loss incurred during channel compression, which offers practical gains for deployment on diverse hardware. The contributions are a new backbone module (DCFS) and a new decoupled head (ADH), validated by ablations and extensive comparisons on COCO benchmarks.
Abstract
In this study, we examine the associations between channel features and convolutional kernels during the processes of feature purification and gradient backpropagation, with a focus on the forward and backward propagation within the network. Consequently, we propose a method called Dense Channel Compression for Feature Spatial Solidification. Drawing upon the central concept of this method, we introduce two innovative modules for backbone and head networks: the Dense Channel Compression for Feature Spatial Solidification Structure (DCFS) and the Asymmetric Multi-Level Compression Decoupled Head (ADH). When integrated into the YOLOv5 model, these two modules demonstrate exceptional performance, resulting in a modified model referred to as YOLOCS. Evaluated on the MSCOCO dataset, the large, medium, and small YOLOCS models yield AP of 50.1%, 47.6%, and 42.5%, respectively. Maintaining inference speeds remarkably similar to those of the YOLOv5 model, the large, medium, and small YOLOCS models surpass the YOLOv5 model's AP by 1.1%, 2.3%, and 5.2%, respectively.
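The AP figures in the abstract also imply the YOLOv5 baseline scores. The following is a minimal arithmetic sketch, assuming the stated gains are absolute differences in AP (percentage points) rather than relative improvements. The variable and function names are illustrative and do not come from the paper.

```python
# AP values (percent, MS-COCO) and gains over YOLOv5 as stated in the abstract.
yolocs_ap = {"large": 50.1, "medium": 47.6, "small": 42.5}
ap_gain = {"large": 1.1, "medium": 2.3, "small": 5.2}


def implied_baseline(ap, gain):
    """Baseline AP implied by a YOLOCS AP and its gain.

    Assumption: the gain is an absolute AP difference (percentage points).
    """
    return round(ap - gain, 1)


baseline_ap = {
    size: implied_baseline(yolocs_ap[size], ap_gain[size]) for size in yolocs_ap
}

for size in yolocs_ap:
    # Relative improvement over the implied baseline, for comparison only.
    relative = 100.0 * ap_gain[size] / baseline_ap[size]
    print(
        f"{size}: YOLOCS {yolocs_ap[size]:.1f} AP, "
        f"implied YOLOv5 {baseline_ap[size]:.1f} AP, "
        f"relative gain {relative:.1f}%"
    )
```

Under this assumption the gain grows as the model shrinks, which matches the abstract's ordering of 1.1, 2.3, and 5.2 for the large, medium, and small models.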
