D-YOLO a robust framework for object detection in adverse weather conditions

Zihan Chu

TL;DR

This work designed a double-route network with an attention feature fusion module, taking both hazy and dehazed features into consideration, and proposed a subnetwork to provide haze-free features to the detection network.

Abstract

Adverse weather conditions, including haze, snow and rain, degrade image quality, which often reduces the performance of deep-learning-based detection networks. Most existing approaches attempt to restore hazy images before performing object detection, which increases the complexity of the network and may result in the loss of latent information. To better integrate the image restoration and object detection tasks, we designed a double-route network with an attention feature fusion module that takes both hazy and dehazed features into consideration. We also proposed a subnetwork to provide haze-free features to the detection network. Specifically, D-YOLO improves the performance of the detection network by minimizing the distance between the clear feature extraction subnetwork and the detection network. Experiments on the RTTS and FoggyCityscapes datasets show that D-YOLO outperforms state-of-the-art methods. It is a robust detection framework that bridges the gap between low-level dehazing and high-level detection.
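
The idea of pulling the detection network toward the clear feature extraction subnetwork can be illustrated with a small feature-distance computation. This is only a sketch under stated assumptions: the abstract does not say which distance is used, so mean squared error is assumed here, and the names `feature_distance`, `f_c` and `f_d` are hypothetical stand-ins, with random arrays in place of real feature maps.

```python
import numpy as np

def feature_distance(f_clear: np.ndarray, f_dehazed: np.ndarray) -> float:
    """Mean squared distance between two feature maps of identical shape.

    MSE is an assumption made for illustration; the source text only says
    that a distance between the two networks is minimized.
    """
    if f_clear.shape != f_dehazed.shape:
        raise ValueError("feature maps must have the same shape")
    return float(np.mean((f_clear - f_dehazed) ** 2))

rng = np.random.default_rng(0)

# Stand-in clear features with layout (batch, channels, height, width).
f_c = rng.normal(size=(2, 8, 4, 4))

# Stand-in dehazed features: close to the clear ones, plus a small error.
f_d = f_c + 0.1 * rng.normal(size=f_c.shape)

loss_near = feature_distance(f_c, f_d)        # small: features nearly agree
loss_far = feature_distance(f_c, f_c + 1.0)   # larger: constant offset of 1
```

In a training loop, a term of this kind would be added to the detection loss so that the dehazed features are driven toward the clear ones; how the two terms are weighted is not stated in this section.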

Paper Structure (26 sections, 6 equations, 8 figures, 10 tables)

This paper contains 26 sections, 6 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: Current methods for object detection in adverse weather conditions. (1) Dehazing and detection are performed sequentially: dehazing models are first trained on a synthetic hazy dataset, then the dehazed images are sent to detection networks for object detection. (2) Dehazing and detection tasks are jointly performed in a single network. (3) Detection models are directly trained on a hazy dataset.
  • Figure 2: The architecture of our proposed D-YOLO. It is composed of a clear feature extraction subnetwork and a detection network. D-YOLO also adopts a dual-branch structure: one branch yields dehazed features via feature adaption and the other preserves hazy features. In addition, an attention feature fusion module is introduced to combine the different features, after which the fused features are sent to the detection head for bounding box prediction. $F_c$ stands for clear features from the clear feature extraction subnetwork, $F_d$ stands for dehazed features and $F_h$ stands for fused features.
  • Figure 3: The architecture of omni-dimensional dynamic convolution (ODConv). ODConv applies attention mechanisms along multiple dimensions of the convolution kernel, which builds positive dependencies around every element and enhances the feature extraction and feature transfer abilities of the network, resulting in better performance for our feature adaption module.
  • Figure 4: Illustration of the four types of attention in ODConv. (a) Location-wise multiplication, (b) channel-wise multiplication, (c) filter-wise multiplication, (d) kernel-wise multiplication.
  • Figure 5: Architecture of our proposed attention feature fusion module. Features are fused through attention-calibrated convolutions.
  • ...and 3 more figures
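
The four multiplications listed for Figure 4 can be made concrete with a short NumPy sketch of how ODConv-style attentions might modulate a bank of candidate kernels. It assumes the attention values are already available (in ODConv they are predicted from the input, which is omitted here) and that each of the `n` candidate kernels has its own set of attentions. The function name `odconv_kernel` and all array names are hypothetical, and the sketch covers only the kernel-modulation step, not the D-YOLO feature adaption module itself.

```python
import numpy as np

def odconv_kernel(kernels, a_spatial, a_channel, a_filter, a_kernel):
    """Combine n candidate kernels into one using four attention types.

    kernels:   (n, c_out, c_in, k, k)  candidate convolution kernels
    a_spatial: (n, k, k)               (a) location-wise attention
    a_channel: (n, c_in)               (b) channel-wise attention
    a_filter:  (n, c_out)              (c) filter-wise attention
    a_kernel:  (n,)                    (d) kernel-wise attention
    Returns a single kernel of shape (c_out, c_in, k, k).
    """
    w = kernels * a_spatial[:, None, None, :, :]   # scale each spatial location
    w = w * a_channel[:, None, :, None, None]      # scale each input channel
    w = w * a_filter[:, :, None, None, None]       # scale each output filter
    w = w * a_kernel[:, None, None, None, None]    # scale each whole kernel
    return w.sum(axis=0)                           # aggregate over the n kernels

rng = np.random.default_rng(0)
n, c_out, c_in, k = 3, 4, 2, 3
kernels = rng.normal(size=(n, c_out, c_in, k, k))

# With the first three attentions set to 1 and a one-hot kernel attention,
# the combined kernel reduces to the selected candidate kernel.
combined = odconv_kernel(
    kernels,
    a_spatial=np.ones((n, k, k)),
    a_channel=np.ones((n, c_in)),
    a_filter=np.ones((n, c_out)),
    a_kernel=np.array([0.0, 1.0, 0.0]),
)
```

The combined kernel would then be used in an ordinary convolution, so the attentions change the weights per input rather than the structure of the layer.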