Sparse Global Matching for Video Frame Interpolation with Large Motion
Chunxu Liu, Guozhen Zhang, Rui Zhao, Limin Wang
TL;DR
This work tackles large-motion video frame interpolation (VFI) with a two-branch framework that combines local intermediate-flow estimation with a sparse global matching branch. The method first estimates the intermediate flows from high-resolution local features, then identifies flawed regions via a difference-based mechanism and computes sparse flow compensation for them using a global receptive field. An adaptive Flow Merge Block fuses the local and sparse global information into merged intermediate flows, which are further refined to synthesize the target frame. The approach yields state-of-the-art performance on challenging large-motion benchmarks while maintaining strong results on small-to-medium motion data, suggesting that combining local detail with targeted global correspondences is a practical strategy for VFI.
Abstract
Large motion poses a critical challenge in the Video Frame Interpolation (VFI) task. Existing methods are often constrained by limited receptive fields, resulting in sub-optimal performance when handling scenarios with large motion. In this paper, we introduce a new pipeline for VFI that effectively integrates global-level information to alleviate issues associated with large motion. Specifically, we first estimate a pair of initial intermediate flows using a high-resolution feature map for extracting local details. Then, we incorporate a sparse global matching branch to compensate for flow estimation, which consists of identifying flaws in the initial flows and generating sparse flow compensation with a global receptive field. Finally, we adaptively merge the initial flow estimation with the global flow compensation, yielding a more accurate intermediate flow. To evaluate the effectiveness of our method in handling large motion, we carefully curate more challenging subsets from commonly used benchmarks. Our method achieves state-of-the-art performance on these large-motion VFI subsets.
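The three stages described above (flaw identification, sparse global matching, and merging) can be illustrated with a minimal NumPy sketch. This is a simplified illustration under stated assumptions, not the authors' implementation: the difference map is taken as a given input, the global matching is approximated by a softmax over dot-product feature similarity between two frames, and the learned Flow Merge Block is replaced by a fixed blending weight `alpha`. All function names and parameters here are hypothetical.

```python
import numpy as np

def select_flawed_points(diff_map, k):
    """Pick the k positions with the largest difference values.
    diff_map: (H, W) array, assumed to score how unreliable the initial flow is.
    Returns a (k, 2) integer array of (row, col) coordinates."""
    top = np.argsort(diff_map.ravel())[::-1][:k]
    return np.stack(np.unravel_index(top, diff_map.shape), axis=1)

def sparse_global_flow(feat0, feat1, points):
    """Match each selected point of feat0 against every position of feat1.
    feat0, feat1: (H, W, C) feature maps (assumed inputs).
    Returns a (k, 2) array of (d_row, d_col) displacements."""
    h, w, c = feat1.shape
    queries = feat0[points[:, 0], points[:, 1]]           # (k, C)
    keys = feat1.reshape(-1, c)                            # (H*W, C)
    scores = queries @ keys.T / np.sqrt(c)                 # global similarity
    scores -= scores.max(axis=1, keepdims=True)            # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)          # softmax over all positions
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    grid = np.stack([rows, cols], axis=-1).reshape(-1, 2)  # (H*W, 2) coordinates
    matched = weights @ grid                               # expected match location
    return matched - points

def merge_flows(initial_flow, points, sparse_flow, alpha):
    """Blend the sparse compensation into the initial flow at the selected points.
    alpha is a fixed weight here; the paper's merge is adaptive (learned)."""
    merged = initial_flow.copy()
    r, c = points[:, 0], points[:, 1]
    merged[r, c] = (1.0 - alpha) * initial_flow[r, c] + alpha * sparse_flow
    return merged
```

Only the `k` selected points are compared against all `H*W` positions, so the cost of the global step scales with `k * H * W` rather than `(H * W)^2`, which is the motivation for keeping the global branch sparse.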
