NiteDR: Nighttime Image De-Raining with Cross-View Sensor Cooperative Learning for Dynamic Driving Scenes

Cidan Shi, Lihuang Fang, Han Wu, Xiaoyu Xian, Yukai Shi, Liang Lin

TL;DR

This work tackles nighttime rainy driving perception by introducing Cross-View Cooperative Learning (CVCL), a two-stage framework that first cleans rain from visible images (CleanNet) and then fuses the cleaned visible data with infrared imagery (FusionNet) to produce high-quality de-rained outputs. CleanNet uses a CNN-Transformer hybrid with AgMoE, Top-selection Self Attention, and Mutual Deformable FFN to robustly remove rain while preserving texture; FusionNet employs information-measure-guided fusion and a cascaded refinement loop to enhance contrast and visual fidelity. Extensive RoadScene-rain experiments show CVCL achieving superior quantitative metrics (PSNR, SSIM, MS-SSIM, FMI variants, Qabf) and qualitative results over state-of-the-art deraining and fusion methods, validating the benefits of cross-view sensor cooperation under challenging low-light rain. The proposed approach advances nighttime autonomous driving perception by leveraging multi-modal data to restore scene details, targets, and contrast, with implications for safer perception and downstream vision tasks in adverse weather. Future work will focus on broader real-world data, real-time optimization, and robustness for dynamic targets.
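Of the metrics listed above, PSNR is the simplest to state. The following is a minimal sketch of its standard definition for grayscale images stored as nested lists. The function name `psnr` and the 8-bit default `max_value=255.0` are assumptions made for illustration, not details taken from the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Both images are nested lists (rows of pixel intensities) in [0, max_value].
    The 8-bit default for max_value is an assumption for this sketch.
    """
    ref_flat = [p for row in reference for p in row]
    out_flat = [p for row in restored for p in row]
    if not ref_flat or len(ref_flat) != len(out_flat):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, out_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the restored image is closer to the rain-free reference. Identical images give an infinite value, which is why the zero-error case is handled separately.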

Abstract

In real-world environments, outdoor imaging systems are often affected by disturbances such as rain degradation. In nighttime driving scenes in particular, insufficient and uneven lighting shrouds the scene in darkness, resulting in degradation of both image quality and visibility. In autonomous driving, the visual perception ability of RGB sensors declines sharply in such harsh scenarios. Additionally, driving assistance systems suffer from a reduced capability to capture and discern the surrounding environment, posing a threat to driving safety. Single-view information captured by single-modal sensors cannot comprehensively depict the entire scene. To address these challenges, we developed an image de-raining framework tailored for rainy nighttime driving scenes. It aims to remove rain artifacts, enrich scene representation, and restore useful information. Specifically, we introduce cooperative learning between visible and infrared images captured by different sensors. Through cross-view fusion of these multi-source data, the scene gains richer texture details and enhanced contrast. We constructed an information cleaning module called CleanNet as the first stage of our framework. We then designed an information fusion module called FusionNet as the second stage to fuse the clean visible images with infrared images. Using this stage-by-stage learning strategy, we obtain de-rained fusion images with higher quality and better visual perception. Extensive experiments demonstrate the effectiveness of our proposed Cross-View Cooperative Learning (CVCL) in adverse driving scenarios in low-light rainy environments. The proposed approach addresses a gap left by existing rain removal algorithms in low-light conditions.
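The stage-by-stage data flow described in the abstract can be sketched as follows. This is only an illustration of how the two stages connect, not the authors' networks: `clean_stage` is a hypothetical stand-in for CleanNet (a simple median filter), `fusion_stage` is a hypothetical stand-in for FusionNet (a variance-weighted blend, with variance used as a crude assumed proxy for an information measure), and `derain_and_fuse` is a name invented here.

```python
from statistics import median, pvariance


def clean_stage(visible):
    """Stage 1 stand-in (hypothetical placeholder for CleanNet).

    Applies a 3-tap horizontal median filter to each row, which suppresses
    thin bright streaks. The real CleanNet is a learned network.
    """
    cleaned = []
    for row in visible:
        padded = [row[0]] + list(row) + [row[-1]]  # replicate border pixels
        cleaned.append([median(padded[i:i + 3]) for i in range(len(row))])
    return cleaned


def fusion_stage(clean_visible, infrared):
    """Stage 2 stand-in (hypothetical placeholder for FusionNet).

    Blends the two images with one global weight per image, proportional to
    its intensity variance. Variance is an assumption made for this sketch.
    """
    var_vis = pvariance([p for row in clean_visible for p in row])
    var_ir = pvariance([p for row in infrared for p in row])
    total = var_vis + var_ir
    w_vis = 0.5 if total == 0 else var_vis / total
    return [
        [w_vis * v + (1.0 - w_vis) * r for v, r in zip(vis_row, ir_row)]
        for vis_row, ir_row in zip(clean_visible, infrared)
    ]


def derain_and_fuse(visible, infrared):
    """Run the two stages in order: clean the visible image, then fuse with infrared."""
    return fusion_stage(clean_stage(visible), infrared)
```

The point of the sketch is the ordering: the fusion stage only ever sees the cleaned visible image, so rain artifacts are dealt with before the infrared information is merged in.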

Paper Structure (33 sections, 14 equations, 11 figures, 3 tables, 1 algorithm)

Figures (11)

  • Figure 1: Visual comparison of five fusion methods in rainy conditions. Because all fusion methods are easily degraded by rainfall, an effective nighttime de-raining method is needed for real-world applications. Best viewed by zooming in on the electronic version.
  • Figure 2: Performance of existing de-raining methods in challenging rainy and low-light conditions. Image de-raining methods typically rely on a single sensor for perception. However, cross-sensor data has the potential to achieve superior image de-raining in challenging conditions. Image de-raining with multi-modal sensor data is therefore essential to advancing safe driver assistance systems. Best viewed by zooming in on the electronic version.
  • Figure 3: Previous image fusion methods assume that all input data is clean and ignore rainfall degradation in real-world conditions.
  • Figure 4: With a cross-view sensor fusion network, our algorithm efficiently removes rain, revealing detailed textures and prominent targets in low-light conditions. Best viewed by zooming in on the electronic version.
  • Figure 5: The overall framework of the proposed CVCL for nighttime image de-raining. Cross-view CleanNet forms the first-stage network, while cross-view FusionNet forms the second-stage network. CleanNet reconstructs high-quality de-rained results. FusionNet fully integrates the clean visible and infrared images to generate the final clean fusion results. The symbol $\oplus$ denotes the summation operation.
  • ...and 6 more figures