Latent Image and Video Resolution Prediction using Convolutional Neural Networks

Rittwika Kansabanik, Adrian Barbu

TL;DR

This paper formulates the latent resolution prediction problem, constructs a dataset for training and evaluation, and introduces several machine learning algorithms, including two Convolutional Neural Networks (CNNs), to address it.

Abstract

This paper introduces a Video Quality Assessment (VQA) problem that has received little attention in the literature, called the latent resolution prediction problem. The problem arises when images or videos are upscaled from their native resolution and are then reported as having the higher, upscaled resolution. This paper formulates the problem, constructs a dataset for training and evaluation, and introduces several machine learning algorithms, including two Convolutional Neural Networks (CNNs), to address it. Experiments indicate that some of the proposed methods can predict the latent video resolution with about 95% accuracy.

Paper Structure

This paper contains 13 sections, 2 equations, 3 figures, 2 tables, and 2 algorithms.

Figures (3)

  • Figure 1: (Illustration) An image or video is claimed to be of a specific resolution (right), but it is an up-scaled version of a lower-resolution image (middle). To simplify the problem, we start with a high-resolution image (left), which is down-scaled and then up-scaled back to the original resolution. The problem is to predict the downscale/upscale factor used.
  • Figure 2: Mask CNN $R^2$ (left) and SoftMax CNN accuracy (right) vs epoch number.
  • Figure 3: Prediction accuracy (%) vs. per-frame (left) and per-video (right) aggregation percentiles for Mask-SoftMax CNN.
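
The setup described in the Figure 1 caption (down-scale a high-resolution image, up-scale it back, then predict the factor) can be illustrated with a minimal sketch. This is not the authors' data pipeline: the function names `downscale_then_upscale` and `make_sample`, the restriction to integer factors, the box-filter downscaling, and the nearest-neighbour upscaling are all assumptions chosen to keep the example dependency-free. The paper may use different interpolation methods and scale factors.

```python
import numpy as np


def downscale_then_upscale(image: np.ndarray, factor: int) -> np.ndarray:
    """Down-scale by an integer factor, then up-scale back (hypothetical helper).

    The result has the same size as the (cropped) input, but its latent
    resolution is 1/factor of that size along each axis.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    h, w = image.shape[:2]
    # Crop so that both dimensions are divisible by the factor (assumption).
    h, w = h - h % factor, w - w % factor
    cropped = image[:h, :w].astype(np.float64)
    # Down-scale: average each factor x factor block (box filter, assumption).
    blocks = cropped.reshape(
        h // factor, factor, w // factor, factor, *cropped.shape[2:]
    )
    low = blocks.mean(axis=(1, 3))
    # Up-scale: repeat each low-resolution pixel (nearest neighbour, assumption).
    return np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)


def make_sample(image, factors=(1, 2, 3, 4), rng=None):
    """Return (degraded_image, factor); the factor is the prediction target."""
    rng = np.random.default_rng() if rng is None else rng
    factor = int(rng.choice(factors))
    return downscale_then_upscale(image, factor), factor
```

A model for this problem would take the degraded image as input and predict the factor. With `factor=1` the image is returned unchanged, which corresponds to content that really is at its reported resolution.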