Predicting total time to compress a video corpus using online inference systems

Xin Shu, Vibhoothi Vibhoothi, Anil Kokaram

TL;DR

This work proposes new Machine Learning systems that predict the cost of compressing an entire corpus rather than per-clip compression time, and finds that aggregate time prediction for a video corpus is more than twice as accurate as using per-clip predictions.

Abstract

Predicting the computational cost of compressing/transcoding clips in a video corpus is important for resource management of cloud services and VOD (Video On Demand) providers. Currently, customers of cloud video services are unaware of the cost of transcoding their files until the task is completed. Previous work concentrated on predicting per-clip compression time, and thus estimating the cost of video compression. In this work, we propose new Machine Learning (ML) systems which predict cost for the entire corpus instead. This is a more appropriate goal, since users are interested in the cost for the whole corpus rather than the per-clip cost. We evaluate our systems with respect to two video codecs (x264, x265) and a novel high-quality video corpus. We find that the accuracy of aggregate time prediction for a video corpus is more than two times better than using per-clip predictions. Furthermore, we present an online inference framework in which we update the ML models as files are processed. Accounting for video compute overhead and choosing an appropriate ML predictor for each fraction of the corpus completed yields a prediction error of less than 5%. This is approximately two times better than previous work, which proposed generalised predictors.
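The distinction between per-clip and aggregate accuracy can be illustrated with a minimal sketch. It assumes absolute percentage error as the metric, which this section does not specify, and it uses made-up encode times rather than data from the paper. The function names are hypothetical. The only point is that over- and under-estimates on individual clips can partly cancel when summed over a corpus.

```python
def mean_per_clip_error(actual, predicted):
    """Mean absolute percentage error over individual clips."""
    return sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


def aggregate_error(actual, predicted):
    """Absolute percentage error of the summed (corpus-level) time."""
    total_actual = sum(actual)
    return abs(sum(predicted) - total_actual) / total_actual


# Hypothetical encode times in seconds (illustrative values, not from the paper).
actual = [100.0, 200.0, 400.0, 300.0]
predicted = [120.0, 180.0, 440.0, 270.0]

per_clip = mean_per_clip_error(actual, predicted)  # each clip is off by 10-20%
corpus = aggregate_error(actual, predicted)        # totals: 1010 s vs 1000 s

print(f"mean per-clip error: {per_clip:.1%}")
print(f"aggregate error:     {corpus:.1%}")
```

In this toy case every clip is mispredicted by 10% or more, yet the predicted total differs from the actual total by about 1%, because the signed errors offset one another. How much cancellation occurs in practice depends on the predictor and the corpus.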

Paper Structure

This paper contains 9 sections, 4 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Scatter plot of the distribution of complexity features of the 600 segments. The x-axis indicates the Spatial Energy ($E$) and the y-axis indicates the Temporal Energy ($h$). The axes are transformed into logarithmic scales. The source datasets are labelled with colours.