Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Guangyang Wu, Xin Tao, Changlin Li, Wenyi Wang, Xiaohong Liu, Qingqing Zheng

TL;DR

This work proposes PerVFI, a new paradigm for Video Frame Interpolation. It incorporates an Asymmetric Synergistic Blending module (ASB) that blends intermediate features using both reference frames, and it introduces a self-learned sparse quasi-binary mask that mitigates ghosting and blur artifacts in the output.

Abstract

Previous methods for Video Frame Interpolation (VFI) have encountered challenges, notably blur and ghosting effects. These issues can be traced back to two pivotal factors: unavoidable motion errors and misalignment in supervision. In practice, motion estimates are often error-prone, resulting in misaligned features. Furthermore, the reconstruction loss tends to produce blurry results, particularly in misaligned regions. To mitigate these challenges, we propose a new paradigm called PerVFI (Perception-oriented Video Frame Interpolation). Our approach incorporates an Asymmetric Synergistic Blending module (ASB) that utilizes features from both sides to synergistically blend intermediate features. One reference frame emphasizes primary content, while the other contributes complementary information. To impose a stringent constraint on the blending process, we introduce a self-learned sparse quasi-binary mask which effectively mitigates ghosting and blur artifacts in the output. Additionally, we employ a normalizing flow-based generator and utilize the negative log-likelihood loss to learn the conditional distribution of the output, which further facilitates the generation of clear and fine details. Experimental results validate the superiority of PerVFI, demonstrating significant improvements in perceptual quality compared to existing methods. Code is available at https://github.com/mulns/PerVFI
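
The abstract does not state the blending formula, so the following is only a minimal sketch of the general idea behind an asymmetric, mask-gated blend. It assumes a per-element convex combination in which a mask value near 1 keeps the primary frame's feature and a value near 0 fills in from the complementary frame. The function name `blend_features`, the toy 1-D feature vectors, and the hand-written mask are all hypothetical and are not taken from the PerVFI code.

```python
def blend_features(primary, complementary, mask):
    """Per-element blend of two aligned feature vectors.

    ASSUMPTION (not from the paper): out = m * primary + (1 - m) * complementary.
    A quasi-binary mask (values close to 0 or 1) picks one source per element
    instead of averaging two possibly misaligned sources.
    """
    if not (len(primary) == len(complementary) == len(mask)):
        raise ValueError("all inputs must have the same length")
    return [m * p + (1.0 - m) * c
            for p, c, m in zip(primary, complementary, mask)]


# Toy aligned features: the two sources disagree at indices 0 and 2
# (misalignment); index 3 is treated as a hole in the primary source.
primary = [1.0, 1.0, 0.0, 0.0]
complementary = [0.0, 1.0, 1.0, 0.7]

# Symmetric soft blend: every element is an average of both sources.
soft = blend_features(primary, complementary, [0.5, 0.5, 0.5, 0.5])

# Quasi-binary blend: trust the primary source everywhere except the hole.
hard = blend_features(primary, complementary, [1.0, 1.0, 1.0, 0.0])

print("soft:", soft)
print("hard:", hard)
```

In this toy case the soft blend mixes the two disagreeing sources at indices 0 and 2, which is the kind of averaging that shows up as ghosting, while the quasi-binary blend keeps the primary content intact and draws on the complementary source only where the primary one has nothing to offer.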

Paper Structure (29 sections, 16 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: We present challenging video frame interpolation examples, demonstrating our approach excels in handling large motion, outperforming alternatives prone to blurriness or ghosting.
  • Figure 2: (a): Overview of the entire PerVFI framework. (b): Structure of the proposed Asymmetric Synergistic Blending (ASB) module. (c): Structure of the conditional normalizing flow-based generator.
  • Figure 3: The Adaptive Dilation Module (ADM) produces the quasi-binary mask $\widetilde{M}^l_b$ by leveraging the binary occlusion mask $M_b^l$ and the two aligned feature sets $f_{t,0}$ and $f_{t,1}$. Panel (a) provides a visualization of the input and output masks, while panel (b) presents the flowchart outlining the operations of ADM. Further details of the module are given in Equations (5)-(8).
  • Figure 4: User study results.
  • Figure 5: Perceptual quality comparison between different methods. Our approach produces a high-quality result despite fast-moving objects that are subject to large motion. Red arrows emphasize areas where PerVFI excels in visual quality compared to other methods.
  • ...and 2 more figures