Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Zhewei Huang, Tianyuan Zhang, Wen Heng, Boxin Shi, Shuchang Zhou
TL;DR
This work tackles real-time video frame interpolation by introducing RIFE, which uses IFNet to directly estimate intermediate optical flows and a fusion map, then synthesizes the target frame without relying on pre-trained flow models. A privileged distillation strategy exploits access to ground-truth intermediate frames during training to stabilize learning and boost accuracy, while a coarse-to-fine IFNet design enables efficient, end-to-end learning. Temporal encoding allows interpolation at arbitrary timesteps, which broadens applications beyond fixed-timestep outputs, and a lightweight RefineNet stage further refines the result. Empirical results on the Vimeo90K, HD, and X4K-1000FPS benchmarks show state-of-the-art performance with substantial speedups over prior flow-based methods, highlighting RIFE’s practical potential for real-time VFI on devices and for more flexible temporal processing tasks.
Abstract
Real-time video frame interpolation (VFI) is useful in video processing, media players, and display devices. We propose RIFE, a Real-time Intermediate Flow Estimation algorithm for VFI. To realize a high-quality flow-based VFI method, RIFE uses a neural network named IFNet that estimates the intermediate flows end-to-end at much higher speed. A privileged distillation scheme is designed to stabilize IFNet training and improve overall performance. RIFE does not rely on pre-trained optical flow models and supports arbitrary-timestep frame interpolation through a temporal encoding input. Experiments demonstrate that RIFE achieves state-of-the-art performance on several public benchmarks. Compared with the popular SuperSlomo and DAIN methods, RIFE is 4 to 27 times faster and produces better results. Furthermore, RIFE can be extended to wider applications thanks to temporal encoding. The code is available at https://github.com/megvii-research/ECCV2022-RIFE.
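To make the flow-based synthesis step concrete, here is a minimal NumPy sketch of how an intermediate frame might be assembled once the two intermediate flows and a fusion map are available. It is an illustration under stated assumptions, not the authors' implementation: the function names `backward_warp` and `synthesize_intermediate` are hypothetical, the flows and fusion map are supplied by the caller (in RIFE they would be predicted by IFNet), and the refinement stage is omitted.

```python
import numpy as np


def backward_warp(image, flow):
    """Sample `image` at pixel positions displaced by `flow`.

    image: (H, W, C) float array.
    flow:  (H, W, 2) array holding (dx, dy) per output pixel.
    Uses bilinear interpolation and clamps samples to the image border.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Source coordinates for every output pixel, clamped to valid range.
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]
    # Bilinear blend of the four neighbouring pixels.
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def synthesize_intermediate(frame0, frame1, flow_t0, flow_t1, fusion_map):
    """Blend the two warped input frames with a per-pixel fusion map.

    flow_t0 / flow_t1: assumed flows from the target time to frame 0 / frame 1.
    fusion_map: (H, W) array in [0, 1]; 1 means "take the pixel from frame 0".
    """
    warped0 = backward_warp(frame0, flow_t0)
    warped1 = backward_warp(frame1, flow_t1)
    m = fusion_map[..., None]
    return m * warped0 + (1 - m) * warped1
```

With zero flows and a constant fusion map of 0.5, the sketch reduces to a plain average of the two input frames, which is a convenient sanity check before plugging in predicted flows.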
