
Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance

Junkai Fan, Jiangwei Weng, Kun Wang, Yijun Yang, Jianjun Qian, Jun Li, Jian Yang

TL;DR

This work tackles real-world driving-video dehazing, where hazy and clear frames cannot be precisely aligned. It introduces a non-aligned regularization framework in which a non-aligned reference frame matching module (NRFM) selects clear reference frames, and a video dehazing network uses a Flow-guided Cosine Attention Sampler (FCAS) and Deformable Cosine Attention Fusion (DCAF) to align and fuse multi-frame information, aided by a pre-dehazing step. The approach is validated on the real-world GoProHazy and DrivingHazy datasets, where it achieves state-of-the-art FADE and NIQE scores, and it generalizes to InternetHazy and the indoor REVIDE data; ablations confirm the contributions of NRFM, FCAS, and DCAF. The method offers a practical solution that does not rely on perfectly aligned ground truth for improving driving visibility and safety under haze, while acknowledging limitations such as sky-region artifacts and non-real-time performance.

Abstract

Real driving-video dehazing poses a significant challenge because precisely aligned hazy/clear video pairs are inherently difficult to acquire for model training, especially in dynamic driving scenarios with unpredictable weather conditions. In this paper, we propose a pioneering approach that addresses this challenge through a non-aligned regularization strategy. Our core concept is to identify clear frames that closely match hazy frames and use them as references to supervise a video dehazing network. Our approach comprises two key components: reference matching and video dehazing. First, we introduce a non-aligned reference frame matching module, which uses an adaptive sliding window to match high-quality reference frames from clear videos. Second, the video dehazing network incorporates flow-guided cosine attention sampler and deformable cosine attention fusion modules to improve spatial multi-frame alignment and fuse the improved information. To validate our approach, we collect a GoProHazy dataset captured effortlessly with GoPro cameras in diverse rural and urban road environments. Extensive experiments demonstrate the superiority of the proposed method over current state-of-the-art methods in the challenging task of real driving-video dehazing.
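The reference-matching idea in the abstract (pick, for each hazy frame, the clear frame whose features are most similar within a sliding window) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' NRFM: the function names, the `half_window` parameter, and the rule that re-centres the window just after the previous match are all hypothetical, and per-frame features are assumed to be plain vectors already extracted by some encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero feature as matching nothing
    return dot / (norm_a * norm_b)


def match_reference_frames(hazy_feats, clear_feats, half_window=2):
    """Return, for each hazy frame, the index of its best-matching clear frame.

    Assumption (hypothetical rule): the search window is centred one frame
    after the previous match, so it follows the clear video as it advances.
    """
    if not clear_feats:
        raise ValueError("clear_feats must contain at least one frame")
    matches = []
    center = 0
    for hazy in hazy_feats:
        lo = max(0, center - half_window)
        hi = min(len(clear_feats), center + half_window + 1)
        # Best candidate inside the current window by cosine similarity.
        best = max(range(lo, hi),
                   key=lambda j: cosine_similarity(hazy, clear_feats[j]))
        matches.append(best)
        center = min(best + 1, len(clear_feats) - 1)
    return matches


# Toy features: three hazy frames, four clear frames.
hazy = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
clear = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.5, 0.5, 0.5], [0.0, 0.1, 0.9]]
matched = match_reference_frames(hazy, clear)
```

In this toy example the third hazy frame skips the ambiguous clear frame at index 2 and matches index 3, which is the behaviour a similarity-based window is meant to give when the two videos drift out of step.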

Paper Structure (23 sections, 13 equations, 20 figures, 6 tables, 1 algorithm)

Figures (20)

  • Figure 1: Spatial and temporal misalignments in real driving hazy/clear video pairs due to inconsistent driving speeds, different driving paths and moving objects.
  • Figure 2: (a) The overall framework of our driving-video dehazing (DVD) method, comprising two crucial components: frame matching and video dehazing. Frame dehazing is applied first to proactively remove haze from individual frames. One significant benefit is the effectiveness and efficiency of our method in training the video dehazing network on authentic driving data without requiring strict alignment, ultimately producing high-quality results. (b) The matching process for non-aligned, clear reference frames, which uses an adaptive sliding window with feature cosine similarity. Our input consists of two frames.
  • Figure 3: (a) Overview of the guided pyramid cosine attention sampler (GPCAS). (b) The proposed FCAS module uses coarse optical flow sampling to enhance the receptive field for cosine correlation calculations. (c) Sampling and calculating cosine correlation.
  • Figure 4: Overview of the proposed DCAF, which makes cosine correlation more robust to pixel misalignment by expanding the receptive field with deformable convolution (DConv), thereby improving cosine fusion performance.
  • Figure 5: Vehicles with different speeds for data collection.
  • ...and 15 more figures
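
The "sampling and calculating cosine correlation" step named in the Figure 3 caption can be illustrated with a small sketch of cosine-similarity-weighted sampling over a local window. It is not the paper's FCAS: the function names, the `radius` and `temperature` parameters, and the softmax weighting are assumptions made for illustration, and `center` stands in for a query position already shifted by a coarse optical-flow estimate.

```python
import math


def cosine(a, b, eps=1e-8):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def cosine_attention_sample(query, ref_map, center, radius=1, temperature=0.1):
    """Blend reference features around `center`, weighted by cosine similarity.

    query:   feature vector from the current frame.
    ref_map: H x W grid (list of lists) of feature vectors from a neighbour frame.
    center:  (row, col) in ref_map, e.g. the query position moved by a coarse flow.
    """
    height, width = len(ref_map), len(ref_map[0])
    # Clamp the centre so the window always contains at least one position.
    r0 = min(max(center[0], 0), height - 1)
    c0 = min(max(center[1], 0), width - 1)
    candidates = [
        ref_map[r][c]
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(width, c0 + radius + 1))
    ]
    # Softmax over temperature-scaled cosine scores (numerically stabilised).
    scores = [cosine(query, feat) / temperature for feat in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * feat[d] for w, feat in zip(weights, candidates))
            for d in range(len(query))]


# Toy 3x3 reference map: only the centre feature points the same way as the query.
ref = [[[0.0, 1.0] for _ in range(3)] for _ in range(3)]
ref[1][1] = [1.0, 0.0]
sampled = cosine_attention_sample([1.0, 0.0], ref, center=(1, 1))
```

With a low temperature the output is dominated by the best-aligned neighbour, so a small spatial misalignment inside the window has little effect on the sampled feature.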