Motion-Aware Generative Frame Interpolation

Guozhen Zhang, Yuhan Zhu, Yutao Cui, Xiaotong Zhao, Kai Ma, Limin Wang

TL;DR

MoG tackles artifacts in flow-based frame interpolation caused by complex motion by integrating intermediate flow guidance with diffusion-based generative refinement. It introduces dual guidance injection at latent and feature levels to align motion with flow trajectories, paired with encoder-only guidance and selective parameter fine-tuning to dynamically correct flow errors. Extensive experiments on real-world and animation benchmarks show MoG substantially improves video quality and fidelity over both flow-based and prior generative VFI methods, while maintaining efficiency. This work bridges flow-based stability and generative flexibility, enabling robust frame interpolation across diverse scenarios.

Abstract

Flow-based frame interpolation methods ensure motion stability through estimated intermediate flow but often introduce severe artifacts in complex motion regions. Recent generative approaches, boosted by large-scale pre-trained video generation models, show promise in handling intricate scenes. However, they frequently produce unstable motion and content inconsistencies due to the absence of explicit motion trajectory constraints. To address these challenges, we propose Motion-aware Generative frame interpolation (MoG), which synergizes intermediate flow guidance with generative capacities to enhance interpolation fidelity. Our key insight is to enforce motion smoothness through flow constraints while simultaneously and adaptively correcting flow estimation errors through generative refinement. Specifically, we first introduce a dual guidance injection that propagates condition information using intermediate flow at both latent and feature levels, aligning the generated motion with flow-derived motion trajectories. Meanwhile, we implement two critical designs, encoder-only guidance injection and selective parameter fine-tuning, which enable dynamic artifact correction in complex motion regions. Extensive experiments on both real-world and animation benchmarks demonstrate that MoG outperforms state-of-the-art methods in terms of video quality and visual fidelity. Our work bridges the gap between flow-based stability and generative flexibility, offering a versatile solution for frame interpolation across diverse scenarios.
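
As a rough illustration of what propagating information along an intermediate flow can look like, the sketch below backward-warps a single-channel grid (standing in for one channel of a frame or latent) with a per-pixel flow field, using bilinear sampling. This is a generic warping operator and not the authors' implementation: the function name `backward_warp`, the list-of-lists data layout, and the border clamping are all assumptions made for this example, and the abstract does not specify how MoG performs the warp.

```python
import math


def backward_warp(src, flow):
    """Bilinearly sample `src` at positions displaced by `flow`.

    src:  H x W grid of numbers (one channel of a frame or latent).
    flow: H x W grid of (dx, dy) pairs, so that
          out[y][x] = src sampled at (x + dx, y + dy).
    Sampling positions are clamped to the grid border (an assumption).
    """
    h, w = len(src), len(src[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position so it stays inside the grid.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            # Integer corners surrounding the sampling position.
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            # Fractional offsets are the bilinear weights.
            ax, ay = sx - x0, sy - y0
            top = (1 - ax) * src[y0][x0] + ax * src[y0][x1]
            bottom = (1 - ax) * src[y1][x0] + ax * src[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


# Hypothetical toy input: a 2 x 4 grid and a uniform flow of one pixel in x.
grid = [[0, 1, 2, 3], [4, 5, 6, 7]]
shift_right = [[(1.0, 0.0)] * 4 for _ in range(2)]
warped = backward_warp(grid, shift_right)
```

In this toy case each output pixel takes the value of its right-hand neighbour, with the last column repeated because of the border clamp. Flow-based interpolators typically apply the same idea to both input frames with flows pointing from the intermediate time step to each of them.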

Paper Structure (28 sections, 10 equations, 8 figures, 5 tables)

This paper contains 28 sections, 10 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Examples of frame interpolation in real-world and animation scenes. Compared to other methods, our approach, MoG, exhibits superior stability in motion and consistency in appearance details.
  • Figure 2: Overview of MoG. MoG consists of two parts. First, it extracts the intermediate flow between input frames. Subsequently, this guidance is injected into the generative model at both the latent and feature levels. Meanwhile, the generative model adaptively rectifies the errors through two crucial designs: encoder-only guidance injection and selective parameter fine-tuning.
  • Figure 3: Demonstration of MoG's guidance correction capability. In both examples, DynamiCrafter or ToonCrafter struggles to generate temporally consistent motion in complex scenarios. While intermediate flow can provide valuable motion cues, it often introduces artifacts and fails to render fine appearance details. Leveraging the encoder-only injection design and selective parameter fine-tuning, MoG effectively integrates reliable motion information from intermediate flow while correcting its inaccuracies.
  • Figure 4: Visual comparison on real-world and animation scenes.
  • Figure 5: An example of manually altering intermediate flow.
  • ...and 3 more figures