Slow-Motion Video Synthesis for Basketball Using Frame Interpolation
Jiantang Huang
TL;DR
This work addresses the challenge of producing high-fidelity slow-motion basketball video from standard broadcast footage by domain-adapting a fast video frame interpolation model to basketball content. The authors fine-tune RIFE on the basketball subset of SportsSloMo, using focused data preparation, 10 epochs of AdamW training, and simple augmentations. On held-out clips, the fine-tuned RIFE achieves a PSNR of about 34.3 dB and an SSIM of about 0.949, outperforming both the base RIFE and Super SloMo while running in real time or near real time on consumer GPUs (about 30 fps). The results indicate that task-specific adaptation is crucial for sports slow-motion and that the approach is a practical, efficient option for broadcast and consumer applications; future improvements may include multi-frame context and perceptual losses to further boost quality.
Abstract
Basketball broadcast footage is typically captured at 30-60 fps, limiting viewers' ability to appreciate rapid plays such as dunks and crossovers. We present a real-time slow-motion synthesis system that produces high-quality interpolated frames for basketball footage by fine-tuning the recent Real-Time Intermediate Flow Estimation (RIFE) network on the SportsSloMo dataset. Our pipeline isolates the basketball subset of SportsSloMo, extracts training triplets, and fine-tunes RIFE with human-aware random cropping. We compare the resulting model against Super SloMo and the baseline RIFE model using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) on held-out clips. The fine-tuned RIFE attains a mean PSNR of 34.3 dB and a mean SSIM of 0.949, outperforming Super SloMo by 2.1 dB and the baseline RIFE by 1.3 dB in PSNR. A lightweight Gradio interface demonstrates end-to-end 4x slow-motion generation on a single RTX 4070 Ti Super at approximately 30 fps. These results indicate that task-specific adaptation is crucial for sports slow-motion, and that RIFE provides an attractive accuracy-speed trade-off for consumer applications.
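The abstract mentions three mechanical steps that are easy to misread: extracting training triplets, scoring with PSNR, and generating 4x slow motion. The following is a minimal sketch of what each could look like, not the authors' code. The function names (`triplet_indices`, `psnr`, `slowmo_timesteps`), the stride parameter, and the assumption of 8-bit frames (peak value 255) are all illustrative choices that do not come from the paper.

```python
import numpy as np


def triplet_indices(num_frames, stride=1):
    """Hypothetical helper: (first, middle, last) frame indices for one clip.

    The outer two frames would be the network inputs and the middle
    frame the ground-truth target.
    """
    return [(i, i + stride, i + 2 * stride)
            for i in range(num_frames - 2 * stride)]


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB between two images, assuming 8-bit frames by default."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def slowmo_timesteps(factor):
    """Fractional positions of the frames to synthesize between two inputs."""
    return [k / factor for k in range(1, factor)]


if __name__ == "__main__":
    print(triplet_indices(5))      # triplets for a 5-frame clip
    print(slowmo_timesteps(4))     # 4x slow motion: three new frames per pair
    gt = np.zeros((4, 4), dtype=np.uint8)
    print(round(psnr(gt, gt + 1), 2))  # every pixel off by one gray level
```

For 4x slow motion, `slowmo_timesteps(4)` yields three intermediate positions per input pair, so each original frame interval becomes four output intervals. SSIM is left out of the sketch because it needs a windowed implementation; an existing routine such as `skimage.metrics.structural_similarity` would be a reasonable choice, though the paper does not say which implementation was used.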
