DA-RAW: Domain Adaptive Object Detection for Real-World Adverse Weather Conditions

Minsik Jeon, Junwon Seo, Jihong Min

TL;DR

This work tackles robust object detection under real-world adverse weather by decomposing the domain gap into a style gap and a weather gap. It proposes a two-branch unsupervised domain adaptation framework built on Faster R-CNN with FPN: image-level style alignment using CBAM-enhanced features and a GRL-driven discriminator, and instance-level weather alignment via prototype-based contrastive learning with Sinkhorn-based soft assignments. Trained jointly with a supervised detection loss $\mathcal{L}_{\text{sup}}$, an image-level alignment loss $\mathcal{L}_{\text{img}}$, and an instance-level alignment loss $\mathcal{L}_{\text{inst}}$, the method outperforms competing approaches on real rainy and snowy datasets without relying on synthetic weather priors or removal networks. The results indicate that explicitly separating the style and weather gaps, and using prototypes to learn weather-invariant representations, improves real-world detection robustness and generalization to diverse adverse conditions.
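
As a rough illustration of how the three losses and the gradient reversal layer (GRL) fit together, the overall objective can be sketched as a weighted sum. The weights $\lambda_{\text{img}}$ and $\lambda_{\text{inst}}$ and the reversal strength $\lambda$ are notation assumed here for illustration; the summary above does not state how the terms are balanced. The second line is the standard GRL definition: identity in the forward pass and a negated, scaled gradient in the backward pass, so that the feature extractor is pushed to confuse the domain discriminator.

```latex
\begin{aligned}
\mathcal{L} &= \mathcal{L}_{\text{sup}}
  + \lambda_{\text{img}}\,\mathcal{L}_{\text{img}}
  + \lambda_{\text{inst}}\,\mathcal{L}_{\text{inst}}
  && \text{(weights are assumed, not taken from the paper)} \\
R_{\lambda}(x) &= x, \qquad
\frac{\partial R_{\lambda}}{\partial x} = -\lambda I
  && \text{(gradient reversal layer)}
\end{aligned}
```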

Abstract

Despite the success of deep learning-based object detection methods in recent years, it is still challenging to make the object detector reliable in adverse weather conditions such as rain and snow. For the robust performance of object detectors, unsupervised domain adaptation has been utilized to adapt the detection network trained on clear weather images to adverse weather images. While previous methods do not explicitly address weather corruption during adaptation, the domain gap between clear and adverse weather can be decomposed into two factors with distinct characteristics: a style gap and a weather gap. In this paper, we present an unsupervised domain adaptation framework for object detection that can more effectively adapt to real-world environments with adverse weather conditions by addressing these two gaps separately. Our method resolves the style gap by concentrating on style-related information of high-level features using an attention module. Using self-supervised contrastive learning, our framework then reduces the weather gap and acquires instance features that are robust to weather corruption. Extensive experiments demonstrate that our method outperforms other methods for object detection in adverse weather conditions.

Paper Structure (13 sections, 5 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overall pipeline of the proposed method. Faster R-CNN with an FPN backbone is adopted as the detection network. Image-level style alignment reduces the style gap by aligning the FPN's high-level features. During alignment, this branch focuses on style-related features by incorporating CBAM, which highlights important spatial and channel details. Instance-level weather alignment uses each instance embedding and its corresponding pseudo-label from RCN to establish a soft assignment of the embedding to learnable class prototypes. Using multi-prototype-based contrastive learning, it resolves the weather gap and constructs a weather-resistant feature representation by increasing the similarity between an instance embedding and its assigned prototypes.
  • Figure 2: Qualitative results on real-world target datasets. Compared to other methods, our method successfully detects objects even in the presence of severe weather corruption and style variations. Even though MPRNet removes raindrops in the first two images, detection performance remains low, indicating that images generated by the removal network do not consistently help object detection. In the remaining images, the removal network fails to remove corruptions and instead creates artifacts, owing to the disparity between real weather data and the synthetic weather data on which MPRNet was trained. While SWDA directly adapts the network to the target domain, it fails to detect objects under severe weather corruption and environmental differences. More qualitative results are available in our multimedia material.
  • Figure 3: Visualization of proposals assigned to each prototype in our rainy dataset. Each row displays the proposals whose instance embeddings are highly similar to one of the car class prototypes. Similar-shaped objects with diverse corruption and styles are assigned to the same prototype, indicating the effectiveness of prototype-based contrastive learning. For example, the first row contains car proposals captured from a rear-view perspective with varying degrees of corruption.
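
The instance-level weather alignment described in the Figure 1 caption (soft assignment of instance embeddings to learnable prototypes, followed by a contrastive loss) can be sketched in plain Python as below. This is a minimal illustration, not the authors' implementation. It assumes cosine similarity as the score, Sinkhorn-Knopp normalization for the soft assignment, and a cross-entropy between the assignment and a temperature-scaled softmax as the loss. It also assumes all instances share one pseudo-label, so only the prototypes of that class appear. The function names and the values of `epsilon`, `n_iters`, and `temperature` are hypothetical.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (guard against a zero vector).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine_scores(embeddings, prototypes):
    # Cosine similarity between every instance embedding and every prototype.
    e = [l2_normalize(v) for v in embeddings]
    p = [l2_normalize(v) for v in prototypes]
    return [[sum(a * b for a, b in zip(ev, pv)) for pv in p] for ev in e]


def sinkhorn(scores, epsilon=0.05, n_iters=3):
    # Sinkhorn-Knopp: alternately balance prototype (column) and instance (row)
    # mass so that prototypes are used roughly equally across the batch.
    n, k = len(scores), len(scores[0])
    top = max(max(row) for row in scores)  # subtract the max for stability
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / (col[j] * k) for j in range(k)] for i in range(n)]
        row = [sum(r) for r in q]
        q = [[x / (row[i] * n) for x in q[i]] for i in range(n)]
    # Rescale so each instance's assignment sums to 1.
    return [[x * n for x in r] for r in q]


def softmax(row, temperature):
    top = max(row)
    exps = [math.exp((s - top) / temperature) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_loss(scores, assignments, temperature=0.1):
    # Cross-entropy between soft assignments and predicted prototype
    # probabilities; minimizing it pulls embeddings toward assigned prototypes.
    total = 0.0
    for s_row, q_row in zip(scores, assignments):
        p_row = softmax(s_row, temperature)
        total -= sum(q * math.log(p) for q, p in zip(q_row, p_row))
    return total / len(scores)


# Toy data: four 2-D instance embeddings and two prototypes of one class.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]

scores = cosine_scores(embeddings, prototypes)
assignments = sinkhorn(scores)
loss = prototype_loss(scores, assignments)
```

In a real training loop the assignments would typically be treated as fixed targets (no gradient), while the embeddings and prototypes are updated through the loss; that detail is a common convention in prototype-based methods and is assumed here rather than taken from the captions above.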