Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models

Qinyu Yang, Haoxin Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan

TL;DR

This work addresses the problem of improving video quality with diffusion-based methods while preserving the original content. It introduces Noise Calibration, a training-free, plug-and-play optimization that refines the initial noise over 1–3 iterations and enforces content consistency on the low-frequency components, using a decomposition into low- and high-frequency parts ($f_l^\nu$ and $f_h^\nu$) at a threshold frequency $\nu$. Embedded in a pre-trained video diffusion framework, the resulting method (NC-SDEdit) improves visual quality while preserving content markedly better than standard SDEdit, and it also improves results when combined with state-of-the-art refinement models. The approach is validated on a 700-video set derived from EvalCrafter, showing quantitative gains across multiple metrics and consistent qualitative improvements. Practical benefits include no additional training cost and fast inference.
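The low-/high-frequency split mentioned above can be illustrated with a small NumPy sketch. This is only one way to realize such a decomposition, not the paper's implementation: the function name `split_frequencies`, the use of a 3D FFT over (frames, height, width), the hard radial cutoff, and the interpretation of `nu` as a fraction of the highest per-axis frequency are all assumptions made for illustration.

```python
import numpy as np

def split_frequencies(video, nu):
    """Split a (frames, height, width) array into low- and high-frequency parts.

    Assumption: `nu` is a cutoff expressed as a fraction of the highest
    per-axis frequency (0.5 cycles per sample); the paper may define it differently.
    """
    spectrum = np.fft.fftn(video)
    # Normalised frequency coordinates along each axis, each in [-0.5, 0.5).
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in video.shape], indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    mask = radius <= 0.5 * nu                   # hard low-pass mask
    low = np.fft.ifftn(spectrum * mask).real    # low-frequency part
    high = video - low                          # everything else
    return low, high

rng = np.random.default_rng(0)
video = rng.standard_normal((4, 16, 16))        # tiny stand-in for a video clip
low, high = split_frequencies(video, nu=0.2)
```

By construction the two parts sum back to the input, so constraining only `low` leaves the high-frequency detail free to change during enhancement.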

Abstract

To improve the quality of synthesized videos, one predominant approach currently involves retraining an expert diffusion model and then applying a noising-denoising process for refinement. Despite the significant training costs, maintaining consistency of content between the original and enhanced videos remains a major challenge. To tackle this challenge, we propose a novel formulation that considers both visual quality and consistency of content. Consistency of content is ensured by a proposed loss function that maintains the structure of the input, while visual quality is improved by utilizing the denoising process of pretrained diffusion models. To address the formulated optimization problem, we have developed a plug-and-play noise optimization strategy, referred to as Noise Calibration. By refining the initial random noise through a few iterations, the content of the original video can be largely preserved, and the enhancement effect improves notably. Extensive experiments demonstrate the effectiveness of the proposed method.
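To make the idea of "refining the initial random noise through a few iterations" concrete, here is a toy sketch. It is not the paper's algorithm: the denoiser `toy_predict_noise` is a hypothetical linear stand-in for a pre-trained model, the noise level `alpha_bar`, the cutoff `nu`, and the update rule (keep the high band of the current noise, take the low band from the predicted noise) are all assumptions chosen so that the example runs and converges.

```python
import numpy as np

def low_pass(x, nu):
    """Keep only frequencies whose normalised radius is at most nu / 2 (assumed cutoff)."""
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in x.shape], indexing="ij")
    mask = np.sqrt(sum(g ** 2 for g in grids)) <= 0.5 * nu
    return np.fft.ifftn(np.fft.fftn(x) * mask).real

def toy_predict_noise(x_t):
    """Hypothetical stand-in for a pre-trained denoiser (NOT a real model)."""
    return 0.5 * x_t

def low_freq_gap(x_ref, noise, alpha_bar, nu):
    """Low-frequency distance between the input and the one-step clean estimate."""
    a, b = np.sqrt(alpha_bar), np.sqrt(1.0 - alpha_bar)
    x_t = a * x_ref + b * noise                      # forward noising of the input
    x0_hat = (x_t - b * toy_predict_noise(x_t)) / a  # one-step estimate of the clean video
    return float(np.linalg.norm(low_pass(x_ref - x0_hat, nu)))

def calibrate_noise(x_ref, noise, alpha_bar=0.5, nu=0.2, num_iters=3):
    """Assumed update: keep the noise's high band, take the low band from the prediction."""
    a, b = np.sqrt(alpha_bar), np.sqrt(1.0 - alpha_bar)
    for _ in range(num_iters):
        pred = toy_predict_noise(a * x_ref + b * noise)
        noise = noise - low_pass(noise, nu) + low_pass(pred, nu)
    return noise

rng = np.random.default_rng(0)
x_ref = rng.standard_normal((4, 16, 16))    # tiny stand-in "video": (frames, height, width)
noise0 = rng.standard_normal(x_ref.shape)
noise1 = calibrate_noise(x_ref, noise0)
gap_before = low_freq_gap(x_ref, noise0, 0.5, 0.2)
gap_after = low_freq_gap(x_ref, noise1, 0.5, 0.2)
```

In this toy setting the low-frequency gap shrinks with every iteration because the stand-in denoiser is a contraction; a real diffusion model offers no such guarantee, so the number of iterations and the cutoff would have to be chosen empirically.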

Paper Structure (15 sections, 15 equations, 9 figures, 4 tables, 1 algorithm)

This paper contains 15 sections, 15 equations, 9 figures, 4 tables, 1 algorithm.

Figures (9)

  • Figure 1: Examples demonstrating video enhancement based on SDEdit
  • Figure 2: Decomposition of the video enhancement process based on SDEdit
  • Figure 3: Visual comparisons of video enhancement based on VideoCrafter (Chen et al., 2023)
  • Figure 4: Comparison with entirely different methods
  • Figure 5: Visual comparisons across iteration steps $N$ and threshold frequencies $\nu$
  • ...and 4 more figures