YOLOCS: Object Detection based on Dense Channel Compression for Feature Spatial Solidification

Lin Huang, Weisheng Li, Yujuan Tan, Linlin Shen, Jing Yu, Haojie Fu

TL;DR

The paper addresses the trade-off between accuracy and real-time speed in single-stage object detectors by introducing two modules, the Dense Channel Compression for Feature Spatial Solidification structure (DCFS) and the Asymmetric Multi-Level Channel Compression Decoupled Head (ADH), which are integrated into a YOLOv5 baseline to form YOLOCS. DCFS improves feature purification and gradient flow in the backbone and neck by progressively compressing channels and expanding receptive fields, while ADH redesigns the head as an asymmetric, multi-path structure that emphasizes the objectness branch and separates classification from regression. Together, DCFS and ADH yield consistent AP gains across the large, medium, and small models on MS-COCO at inference speeds close to those of YOLOv5, and the resulting models compare favorably with several state-of-the-art detectors. The work advances real-time object detection by improving forward and backward information propagation and reducing feature loss during channel compression, which offers practical gains for deployment on diverse hardware. The contributions are a new backbone structure (DCFS) and a new decoupled head (ADH), validated by ablations and extensive comparisons on COCO benchmarks.

Abstract

In this study, we examine the associations between channel features and convolutional kernels during the processes of feature purification and gradient backpropagation, with a focus on the forward and backward propagation within the network. Consequently, we propose a method called Dense Channel Compression for Feature Spatial Solidification. Drawing upon the central concept of this method, we introduce two innovative modules for backbone and head networks: the Dense Channel Compression for Feature Spatial Solidification Structure (DCFS) and the Asymmetric Multi-Level Compression Decoupled Head (ADH). When integrated into the YOLOv5 model, these two modules demonstrate exceptional performance, resulting in a modified model referred to as YOLOCS. Evaluated on the MSCOCO dataset, the large, medium, and small YOLOCS models yield AP of 50.1%, 47.6%, and 42.5%, respectively. Maintaining inference speeds remarkably similar to those of the YOLOv5 model, the large, medium, and small YOLOCS models surpass the YOLOv5 model's AP by 1.1%, 2.3%, and 5.2%, respectively.
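
The AP figures above also fix the YOLOv5 baseline values that the comparison rests on. The following is a small arithmetic sketch, assuming the reported differences are absolute percentage points of AP (the abstract writes them with a percent sign); the variable names are hypothetical and chosen only for illustration.

```python
# AP (%) reported in the abstract for the large, medium, and small YOLOCS models.
yolocs_ap = {"large": 50.1, "medium": 47.6, "small": 42.5}

# Reported AP gains over YOLOv5, read here as absolute percentage points
# (an assumption; the abstract does not say "points" explicitly).
gain_over_yolov5 = {"large": 1.1, "medium": 2.3, "small": 5.2}

# Baseline AP implied by the two sets of numbers above. Rounding to one
# decimal only hides floating-point noise from the subtraction.
implied_yolov5_ap = {
    size: round(yolocs_ap[size] - gain_over_yolov5[size], 1)
    for size in yolocs_ap
}

for size, ap in implied_yolov5_ap.items():
    print(f"{size}: implied YOLOv5 AP = {ap}%")
```

Under this reading the implied baselines are 49.0%, 45.3%, and 37.3%, so the improvement grows as the model gets smaller.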

Paper Structure

This paper contains 13 sections, 5 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Comparison of the proposed YOLOCS with YOLOv5-R6.1 and YOLOX.
  • Figure 2: Bottleneck structures in backbone networks within different YOLO network architectures.
  • Figure 3: Analyzing the differences in features and errors during the forward and backward propagation processes for channel compression, when utilizing convolution kernels of various sizes. The orange dashed line and the label "Rotate 180$^{\circ}$" indicate that the convolution kernel used in the forward propagation is transformed into the convolution kernel used in error backpropagation by rotating it 180 degrees.
  • Figure 4: The Dense Channel Compression for Feature Spatial Solidification architecture is fully incorporated into the backbone network, with the shortcut component removed from the neck structure.
  • Figure 5: Comparison between YOLOCS's Asymmetric Multi-Level Channel Compression Decoupled Head and YOLOX's Decoupled Head.
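
The "Rotate 180°" relationship in the Figure 3 caption is the standard backpropagation identity for a convolution layer whose forward pass is a cross-correlation, as in most deep learning frameworks: the error with respect to the input is obtained by sliding the 180-degree-rotated kernel over the zero-padded output error. The sketch below checks this on a toy single-channel, stride-1, unpadded case in plain Python. It is not the paper's code, and all helper names (`correlate_valid`, `rotate_180`, `zero_pad`, `input_gradient`, `input_gradient_direct`) are hypothetical.

```python
def correlate_valid(x, k):
    """Stride-1 'valid' cross-correlation of 2-D lists (a conv layer's forward pass)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
         for j in range(ow)]
        for i in range(oh)
    ]


def rotate_180(k):
    """Rotate a 2-D kernel by 180 degrees (flip rows and columns)."""
    return [row[::-1] for row in k[::-1]]


def zero_pad(x, ph, pw):
    """Surround x with ph rows and pw columns of zeros on every side."""
    width = len(x[0]) + 2 * pw
    top = [[0] * width for _ in range(ph)]
    middle = [[0] * pw + list(row) + [0] * pw for row in x]
    bottom = [[0] * width for _ in range(ph)]
    return top + middle + bottom


def input_gradient(delta, k):
    """dL/dx from delta = dL/dy: correlate the padded error with the rotated kernel."""
    kh, kw = len(k), len(k[0])
    return correlate_valid(zero_pad(delta, kh - 1, kw - 1), rotate_180(k))


def input_gradient_direct(delta, k, h, w):
    """Reference: push each output error back onto the inputs it was computed from."""
    grad = [[0] * w for _ in range(h)]
    for i, row in enumerate(delta):
        for j, d in enumerate(row):
            for a in range(len(k)):
                for b in range(len(k[0])):
                    grad[i + a][j + b] += d * k[a][b]
    return grad


# A 3x3 kernel applied to a 4x4 input gives a 2x2 output, so delta is 2x2.
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
delta = [[1, -1], [2, 0]]

via_rotation = input_gradient(delta, kernel)
reference = input_gradient_direct(delta, kernel, 4, 4)
print("rotated-kernel gradient matches reference:", via_rotation == reference)
```

For a 1×1 kernel the rotation leaves the kernel unchanged, whereas a larger kernel spreads each output error over a neighborhood of input positions during backpropagation. A real multi-channel layer additionally swaps the roles of input and output channels, which this single-channel sketch omits.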