
Binarized Low-light Raw Video Enhancement

Gengchen Zhang, Yulun Zhang, Xin Yuan, Ying Fu

TL;DR

This work tackles low-light raw video enhancement on resource-limited devices by binarizing the entire pipeline. It introduces BRVE, a compact binary framework that combines distribution-aware binary convolution (DABC) with a spatial-temporal shift mechanism to fuse temporal information while keeping the efficiency of binarization. The key contributions are the BRVE architecture with recurrent embeddings, the DABC module augmented by distribution-aware channel attention (DACA), and a parameter-free shift-based fusion strategy that preserves temporal consistency in noisy low-light videos. Experiments on SMOID and LLRVD show that BRVE matches or outperforms some full-precision models with significantly fewer FLOPs and parameters, enabling practical edge-device deployment for real-time low-light video enhancement.
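To illustrate why binarization reduces compute, here is a minimal sketch of the generic XNOR/popcount identity behind binary convolutions: a dot product of two {-1, +1} vectors of length n equals n minus twice the number of sign mismatches. It is not the paper's DABC. The helper names `pack_signs`, `binary_dot`, and `sign_dot_reference` are hypothetical, and mapping zero to +1 is an assumption made for this example.

```python
def pack_signs(values):
    """Pack real values into an int bitmask: bit i is 1 when values[i] >= 0.

    Assumption for this sketch: zero is binarized to +1.
    """
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors stored as bitmasks."""
    mismatches = bin(a_bits ^ b_bits).count("1")  # positions where the signs differ
    return n - 2 * mismatches


def sign_dot_reference(a, b):
    """Reference result: binarize with sign, then multiply and accumulate."""
    def sgn(v):
        return 1 if v >= 0 else -1
    return sum(sgn(x) * sgn(y) for x, y in zip(a, b))


activations = [0.3, -1.2, 0.0, 2.5]
weights = [-0.7, -0.1, 0.9, 1.0]
fast = binary_dot(pack_signs(activations), pack_signs(weights), len(weights))
slow = sign_dot_reference(activations, weights)
print(fast, slow)
```

Both paths return the same value, but the bitmask path replaces multiplications with one XOR and one bit count per machine word.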

Abstract

Recently, deep neural networks have achieved excellent performance on low-light raw video enhancement. However, they often come with high computational complexity and large memory costs, which hinder their application on resource-limited devices. In this paper, we explore the feasibility of applying the extremely compact binary neural network (BNN) to low-light raw video enhancement. There are two main issues with binarizing video enhancement models. One is how to fuse temporal information to improve low-light denoising without complex modules. The other is how to narrow the performance gap between binary convolutions and full-precision ones. To address the first issue, we introduce a spatial-temporal shift operation, which is easy to binarize and effective. The temporal shift efficiently aggregates the features of neighboring frames, and the spatial shift handles the misalignment caused by large motion in videos. For the second issue, we present a distribution-aware binary convolution, which captures the distribution characteristics of the real-valued input and incorporates them into plain binary convolutions to alleviate the degradation in performance. Extensive quantitative and qualitative experiments show that our high-efficiency binarized low-light raw video enhancement method attains promising performance.
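The idea of aggregating neighboring frames by shifting features can be sketched as below. This is a generic channel-wise temporal shift, assumed here for illustration only. The paper's exact shift pattern and its spatial component are not reproduced. The function name `temporal_shift`, the `fold_div` split, and the `pad` value are hypothetical choices.

```python
def temporal_shift(frames, fold_div=4, pad=0):
    """Shift a fraction of channels along the time axis without any parameters.

    frames: list over time; each entry is a list of per-channel feature maps
            (plain numbers in this toy example).
    The first C // fold_div channels are taken from the previous frame,
    the next C // fold_div channels from the next frame, and the rest stay put.
    Positions with no source frame are filled with `pad`.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div
    shifted = []
    for t in range(num_frames):
        row = []
        for ch in range(num_channels):
            if ch < fold:
                src = t - 1          # borrow from the previous frame
            elif ch < 2 * fold:
                src = t + 1          # borrow from the next frame
            else:
                src = t              # keep the current frame's feature
            row.append(frames[src][ch] if 0 <= src < num_frames else pad)
        shifted.append(row)
    return shifted


# Toy input: 3 frames, 4 channels, feature value = 4 * t + channel index.
frames = [[4 * t + ch for ch in range(4)] for t in range(3)]
mixed = temporal_shift(frames, fold_div=4, pad=-1)
print(mixed)
```

After the shift, each frame's channel stack mixes features from its neighbors, so a following convolution can fuse temporal information without a dedicated alignment module.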

Paper Structure

This paper contains 32 sections, 11 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Efficiency and performance comparison of full precision networks and binary neural networks (BNNs).
  • Figure 2: Overall architecture of the BRVE model. (a) BRVE uses a shift binary U-Net for local feature fusion and exploits recurrent embeddings for long-range feature propagation. (b) Shift binary U-Net. (c) The shift encoder/decoder consists of a shift operation, a binary fusion block, and several binary conv blocks using the distribution-aware binary convolution (DABC).
  • Figure 3: Distribution-Aware Binary Convolution (DABC).
  • Figure 4: Spatial-temporal shift operation.
  • Figure 5: Visual comparison of different low-light video enhancement methods on the SMOID dataset.
  • ...and 2 more figures