Weather-Aware Object Detection Transformer for Domain Adaptation

Soheil Gharatappeh, Salimeh Sekeh, Vikas Dhiman

TL;DR

The paper tackles the fragility of real-time DETR-based detectors (RT-DETR) under fog by proposing three weather-aware strategies: (i) domain adaptation via perceptual loss in a teacher–student framework, trained with $L=\mathcal{L}_{obj}+\mathcal{L}_{perc}$; (ii) fog-aware Weather Adaptive Attention, which scales attention with a fog proxy $V_w$; and (iii) a Weather Fusion Encoder that fuses clear and foggy streams through cross-attention. In experiments on VOC, RTTS, and synthetic fog, the perceptual-loss approach yields consistent cross-domain improvements over baselines, the dual-stream fusion matches baseline performance, and the attention method exhibits training instability and fails to converge in some settings. The results show how difficult it is to transfer transformer-based detectors to adverse weather, and they highlight stability problems in attention modulation and cross-domain feature fusion. These findings point future work toward weather-aware object detection with more stable training dynamics and coverage of more weather conditions.
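To make the combined objective $L=\mathcal{L}_{obj}+\mathcal{L}_{perc}$ concrete, here is a minimal sketch in plain Python. It rests on assumptions not stated in the summary: the perceptual term is taken to be a mean squared error between student and teacher feature vectors, averaged over layers, and the names `perceptual_loss`, `total_loss`, and `perc_weight` are hypothetical. The paper's actual feature layers, distance measure, and weighting may differ.

```python
from typing import Sequence


def perceptual_loss(student_feats: Sequence[Sequence[float]],
                    teacher_feats: Sequence[Sequence[float]]) -> float:
    """Assumed perceptual term: per-layer mean squared error between
    student and teacher features, averaged over layers."""
    if not student_feats or len(student_feats) != len(teacher_feats):
        raise ValueError("need the same non-zero number of layers from both networks")
    per_layer = []
    for s, t in zip(student_feats, teacher_feats):
        if not s or len(s) != len(t):
            raise ValueError("feature vectors at a layer must be non-empty and equal in length")
        # Mean squared difference over the elements of this layer's feature vector.
        per_layer.append(sum((a - b) ** 2 for a, b in zip(s, t)) / len(s))
    return sum(per_layer) / len(per_layer)


def total_loss(obj_loss: float,
               student_feats: Sequence[Sequence[float]],
               teacher_feats: Sequence[Sequence[float]],
               perc_weight: float = 1.0) -> float:
    """L = L_obj + perc_weight * L_perc.

    perc_weight is a hypothetical knob; 1.0 reproduces the unweighted sum
    given in the summary."""
    return obj_loss + perc_weight * perceptual_loss(student_feats, teacher_feats)


# Toy example: two "layers" of flattened features.
student = [[1.0, 2.0], [0.0, 0.0]]   # e.g. student features on a foggy image
teacher = [[1.0, 0.0], [0.0, 0.0]]   # e.g. teacher features on the clear image
print(total_loss(0.5, student, teacher))
```

In a real training loop the feature lists would be tensors taken from matching backbone layers, with the teacher's features held fixed so that gradients flow only through the student.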

Abstract

RT-DETRs have shown strong performance across various computer vision tasks but are known to degrade under challenging weather conditions such as fog. In this work, we investigate three novel approaches to enhance RT-DETR robustness in foggy environments: (1) Domain Adaptation via Perceptual Loss, which distills domain-invariant features from a teacher network to a student using perceptual supervision; (2) Weather Adaptive Attention, which augments the attention mechanism with fog-sensitive scaling by introducing an auxiliary foggy image stream; and (3) Weather Fusion Encoder, which integrates a dual-stream encoder architecture that fuses clear and foggy image features via multi-head self- and cross-attention. Despite the architectural innovations, none of the proposed methods consistently outperforms the baseline RT-DETR. We analyze the limitations and potential causes, offering insights for future research in weather-aware object detection.

Paper Structure

This paper contains 19 sections, 11 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: RT-DETR Perceptual Loss
  • Figure 2: Dual Self-Attention + Cross-Attention Fusion
  • Figure 3: Fog-Aware Attention