VMAF Re-implementation on PyTorch: Some Experimental Results
Kirill Aistov, Maxim Koroteev
TL;DR
The paper presents a PyTorch re-implementation of VMAF that enables differentiable optimization, addressing claims that VMAF is non-differentiable. It validates gradient behavior through gradient checking and demonstrates that VMAF can serve as a training loss: a single 7×7 preprocessing filter learned via SGD outperforms unsharp masking in VMAF-based quality improvement. The PyTorch implementation differs from the libvmaf reference by $<10^{-2}$ VMAF units, its gradients are well behaved, and its timing is practical for offline training and real-time filtering. The approach is validated on HEVC RD curves and Netflix data. Overall, the work enables differentiable VMAF-based optimization for video processing tasks and provides insights into the practical use of VMAF as a learning objective.
Abstract
Based on the standard VMAF implementation, we propose an implementation of VMAF in the PyTorch framework. Comparisons with the standard implementation (libvmaf) show a discrepancy of $\lesssim 10^{-2}$ in VMAF units. We investigate gradient computation when VMAF is used as an objective function and demonstrate that training with this function does not result in ill-behaved gradients. The implementation is then used to train a preprocessing filter, whose performance is shown to be superior to that of the unsharp masking filter. The resulting filter is also easy to implement and can be applied in video processing tasks to improve video compression. This is confirmed by the results of numerical experiments.
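The idea of checking gradients of an objective with respect to filter coefficients can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the objective is a stand-in (negative mean squared error), not VMAF, the filter is a 7-tap 1D kernel rather than the 7×7 filter discussed in the paper, and the helper names (`filter_1d`, `score`, `analytic_grad`, `numeric_grad`) are hypothetical. It compares a hand-derived gradient with a central finite-difference estimate, which is the basic principle behind gradient checking.

```python
import random


def filter_1d(signal, kernel):
    """Valid-mode correlation of a 1D signal with a kernel."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def score(kernel, signal, target):
    """Stand-in quality score (NOT VMAF): negative mean squared error."""
    out = filter_1d(signal, kernel)
    return -sum((o - t) ** 2 for o, t in zip(out, target)) / len(out)


def analytic_grad(kernel, signal, target):
    """Hand-derived gradient of `score` with respect to each kernel tap."""
    out = filter_1d(signal, kernel)
    n = len(out)
    return [-2.0 / n * sum((out[i] - target[i]) * signal[i + j]
                           for i in range(n))
            for j in range(len(kernel))]


def numeric_grad(kernel, signal, target, eps=1e-6):
    """Central finite-difference estimate of the same gradient."""
    grad = []
    for j in range(len(kernel)):
        plus = list(kernel)
        minus = list(kernel)
        plus[j] += eps
        minus[j] -= eps
        grad.append((score(plus, signal, target)
                     - score(minus, signal, target)) / (2 * eps))
    return grad


rng = random.Random(0)                      # fixed seed for reproducibility
signal = [rng.uniform(0.0, 1.0) for _ in range(32)]
kernel = [rng.uniform(-0.5, 0.5) for _ in range(7)]
target = signal[3:-3]                       # same length as the filtered output

g_analytic = analytic_grad(kernel, signal, target)
g_numeric = numeric_grad(kernel, signal, target)
grad_error = max(abs(a - b) for a, b in zip(g_analytic, g_numeric))
print(f"max |analytic - numeric| = {grad_error:.3e}")
```

In a PyTorch setting the analytic gradient would come from automatic differentiation instead of a hand-derived formula, and `torch.autograd.gradcheck` performs a comparable finite-difference comparison on double-precision inputs.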
