Cascading Refinement Video Denoising with Uncertainty Adaptivity

Xinyuan Yu

TL;DR

The paper tackles video denoising under realistic, multi-level noise, where alignment accuracy is critical for restoration. It introduces a cascading refinement framework that jointly refines optical-flow-based alignment and frame restoration, and produces an uncertainty map after each iteration to guide early stopping. Key components include pre-denoising with patch matching, a RAFT-inspired iterative flow estimator with pyramid correlation, a flow-guided deformable-convolution reconstruction network, and an uncertainty-adaptive loss. Early stopping on easily restored content reduces computation by 25% on average, and the method achieves state-of-the-art results on the CRVD dataset. The approach improves robustness to varying noise levels and offers practical gains for real-world video analysis and downstream tasks.

Abstract

Accurate alignment is crucial for video denoising, but estimating alignment in noisy environments is challenging. This paper introduces a cascading refinement video denoising method that refines alignment and restores images simultaneously. Better alignment enables the restoration of more detail in each frame, and better image quality in turn leads to better alignment. The method achieves SOTA performance by a large margin on the CRVD dataset. To deal with multi-level noise, an uncertainty map is produced after each iteration, which avoids redundant computation on easily restored videos. As a result, overall computation is reduced by 25% on average.

Paper Structure (18 sections, 15 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Patch matching is conducted on the pre-denoised images. The matched patches (indicated by the yellow rectangle) are used as the input to the cascading refinement.
  • Figure 2: The upper part of the diagram shows the iterative refinement of the optical flow estimate. The bottom part illustrates the cascading image reconstruction network. The network takes matched patches from three consecutive frames as input. For simplicity, only patches from two frames are drawn.
  • Figure 3: The reconstruction block contains a flow-guided deformable module. A fusion module is added to fuse the aligned feature maps of supporting frames and the reference frame. Two heads follow the fused feature maps for image reconstruction and uncertainty estimation separately. The blue rectangle denotes deformable convolutions.
  • Figure 4: After each iteration, we calculate the mean value of the uncertainty map for each patch. If the mean value is larger than a certain threshold, we directly output the denoised result. Otherwise, we continue the iterative refinement procedure.
  • Figure 5: Comparison on a CRVD indoor scene with ISO 25600. Zoom in for better observation.
  • ...and 3 more figures
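
The early-exit rule described in the Figure 4 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `refine_with_early_exit`, `toy_step`, `max_iters`, and the threshold values are hypothetical, and the toy step only stands in for the paper's flow-refinement and reconstruction networks. The comparison direction follows the caption as written.

```python
from statistics import fmean
from typing import Callable, List, Tuple

Grid = List[List[float]]  # a 2-D patch or a per-pixel map


def map_mean(grid: Grid) -> float:
    """Mean over all entries of a 2-D map."""
    return fmean(v for row in grid for v in row)


def refine_with_early_exit(
    patch: Grid,
    refine_step: Callable[[Grid], Tuple[Grid, Grid]],
    threshold: float,
    max_iters: int = 4,  # hypothetical cap; the source does not state one here
) -> Tuple[Grid, int]:
    """Run refinement steps on one patch and stop early.

    After each step, the mean of the returned map is compared with
    `threshold`. Following the Figure 4 caption, a mean larger than the
    threshold ends the loop and the current result is returned.
    """
    current = patch
    for iteration in range(1, max_iters + 1):
        current, umap = refine_step(current)
        if map_mean(umap) > threshold:
            return current, iteration
    return current, max_iters


def toy_step(patch: Grid) -> Tuple[Grid, Grid]:
    """Stand-in for one refinement iteration (an assumption for the demo).

    Halves every pixel value and returns a constant map whose value grows
    as the patch magnitude shrinks.
    """
    refined = [[0.5 * v for v in row] for row in patch]
    level = 1.0 - map_mean([[abs(v) for v in row] for row in refined])
    umap = [[level for _ in row] for row in refined]
    return refined, umap


if __name__ == "__main__":
    noisy_patch = [[0.8, 0.8], [0.8, 0.8]]
    result, used = refine_with_early_exit(noisy_patch, toy_step, threshold=0.75)
    print(f"stopped after {used} iteration(s)")
```

Because the check runs per patch, patches that pass the threshold early skip the remaining iterations, which is the mechanism the abstract credits for the reduced computation.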