BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering

Xinmin Qiu, Congying Han, Zicheng Zhang, Bonan Li, Tiande Guo, Pingyu Wang, Xuecheng Nie

TL;DR

Inspired by the classic scale-time equalization (STE), this work introduces a histogram-assisted solution, called BlazeBVD, for high-fidelity and rapid blind video deflickering (BVD). It uses smoothed illumination histograms within STE filtering to ease the challenge of learning temporal data with neural networks.

Abstract

Blind video deflickering (BVD), which aims to improve the temporal consistency of video, is gaining importance with the rapid growth of image processing and video generation. However, the intricate nature of video data complicates the training of deep learning methods, leading to high resource consumption and instability, notably under severe lighting flicker. This underscores the need for a compact representation beyond pixel values to advance BVD research and applications. Inspired by the classic scale-time equalization (STE), our work introduces a histogram-assisted solution, called BlazeBVD, for high-fidelity and rapid BVD. Whereas STE directly corrects pixel values by temporally smoothing color histograms, BlazeBVD uses smoothed illumination histograms within STE filtering to ease the challenge of learning temporal data with neural networks. Technically, BlazeBVD begins by condensing pixel values into illumination histograms that precisely capture flickering and local exposure variations. These histograms are then smoothed to produce a set of singular frames, filtered illumination maps, and exposure maps. Drawing on these deflickering priors, BlazeBVD uses a 2D network to restore faithful and consistent texture affected by lighting changes or localized exposure issues. BlazeBVD also incorporates a lightweight 3D network to correct slight temporal inconsistencies while avoiding heavy resource consumption. Comprehensive experiments on synthetic, real-world, and generated videos show superior qualitative and quantitative results for BlazeBVD, with inference up to 10x faster than state-of-the-art methods.
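
To make the classic STE idea mentioned in the abstract concrete, here is a minimal sketch of temporal histogram smoothing on luminance frames. It is not BlazeBVD itself, which feeds the smoothed histograms to neural networks as priors. It only illustrates the pixel-level correction that STE-style filtering performs. The function names (`temporal_gaussian`, `ste_deflicker`), the Gaussian temporal kernel, and the parameters `sigma` and `n_quantiles` are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def temporal_gaussian(x, sigma):
    """Gaussian-filter a (T, Q) array along its time axis, with edge padding."""
    radius = max(1, int(3 * sigma))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(x, ((radius, radius), (0, 0)), mode="edge")
    out = np.zeros_like(x, dtype=np.float64)
    for k, weight in enumerate(kernel):
        out += weight * padded[k:k + x.shape[0]]
    return out


def ste_deflicker(luma, sigma=2.0, n_quantiles=256):
    """Smooth per-frame luminance distributions over time (STE-style sketch).

    luma: float array of shape (T, H, W) holding one luminance map per frame.
    Returns an array of the same shape with temporally smoothed brightness.
    """
    levels = np.linspace(0.0, 1.0, n_quantiles)
    # Per-frame quantile function, i.e. the inverse cumulative histogram.
    quantiles = np.stack([np.quantile(frame, levels) for frame in luma])
    # Assumed smoothing: a Gaussian across the time axis at each quantile level.
    smoothed = temporal_gaussian(quantiles, sigma)
    out = np.empty(luma.shape, dtype=np.float64)
    for t in range(luma.shape[0]):
        # Remap each pixel from the frame's own quantiles to the smoothed ones.
        out[t] = np.interp(luma[t], quantiles[t], smoothed[t])
    return out
```

A quick way to try it is to multiply a fixed image by a per-frame gain to simulate global flicker and compare the per-frame mean brightness before and after `ste_deflicker`. A larger `sigma` smooths brightness over a longer time window, at the risk of flattening genuine lighting changes.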

Paper Structure (30 sections, 13 equations, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Comparisons of the proposed BlazeBVD. We present the flickering input, the ground truth (GT), and video frames processed by Deflicker Lei_2023 and by our BlazeBVD, together with their illumination histograms and the KL divergence with respect to the ground truth. Our method recovers the illumination histograms well while avoiding color artifacts and color distortions (such as the man's arm in the second column). Best viewed in color at 2$\times$ zoom.
  • Figure 2: The framework of our approach BlazeBVD. We first extract flicker prior information about the input video and correct the brightness representation with STE-assisted histogram filtering in illuminance space (Stage1). Then, the prior is leveraged to remove temporal flicker and over-/under-exposure flicker from the global and local perspectives (GFRM and LFRM in Stage2). Finally, the temporal consistency of the processed video is improved (TCM in Stage3).
  • Figure 3: Pipeline of Stage1. The illumination maps are corrected in the temporal dimension by STE, which alleviates the difficulty of subsequent temporal data learning using networks. The deflickering priors extracted in this stage are $\tilde{V}_t$, $M_t$ and $S_{flicker}$.
  • Figure 4: Qualitative comparisons between previous methods and our BlazeBVD. Our method removes flicker and restores details in over-exposed regions (red boxes in row 6 and row 8) while avoiding color artifacts (row 2 and row 4). BlazeBVD also preserves the fidelity of the video content and avoids color distortion (cyan boxes in row 5 and row 7). Zoom in for the best view; we recommend watching the videos in the supplementary materials.
  • Figure 5: Qualitative ablation studies of key designs in BlazeBVD. Local detail loss is marked by the cyan box, color artifacts by the red box, temporal inconsistency by the black box, and color distortion by the blue box. Zoom in for the best view.
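
The Figure 1 caption compares illumination histograms using KL divergence with respect to the ground truth. As a rough sketch of how such a score could be computed, the snippet below builds a histogram from a luminance map and evaluates a smoothed KL divergence. The bin count, the value range, the `eps` smoothing term, and the direction of the divergence (ground truth first) are all assumptions, since this section does not specify them.

```python
import numpy as np


def illumination_histogram(luma, bins=256, value_range=(0.0, 256.0)):
    """Count luminance values into a fixed number of bins (assumed 8-bit range)."""
    counts, _ = np.histogram(luma, bins=bins, range=value_range)
    return counts.astype(np.float64)


def kl_divergence(p_counts, q_counts, eps=1e-8):
    """KL(P || Q) between two histograms, with eps added to avoid log(0)."""
    p = (p_counts + eps) / (p_counts + eps).sum()
    q = (q_counts + eps) / (q_counts + eps).sum()
    return float(np.sum(p * np.log(p / q)))
```

Calling `kl_divergence(illumination_histogram(gt), illumination_histogram(out))` on a ground-truth luminance map `gt` and a processed one `out` gives a non-negative score that is zero when the two histograms coincide.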