CNNs for Style Transfer of Digital to Film Photography

Pierre Mackenzie, Mika Senghaas, Raphael Achddou

TL;DR

This work uses simple convolutional neural networks to model Cinestill800T film given a digital input. A combination of MSE/VGG loss gives the best colour production. Some grain can be produced, but it is not of high quality, and no halation is produced.

Abstract

Deep learning has seen increasing use in stylistic effect generation over recent years. In this work, we use simple convolutional neural networks to model Cinestill800T film given a digital input. We test the effect of different loss functions, the addition of an input noise channel, and the use of random scales of patches during training. We find that a combination of MSE/VGG loss gives the best colour production and that some grain can be produced, but it is not of high quality, and no halation is produced. We contribute our dataset of aligned paired images taken with a film and a digital camera for further work.
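
As a rough illustration of two of the training variations named above (the input noise channel and random-scale patches), the following is a minimal sketch rather than the authors' implementation. The function names, the Gaussian noise distribution, the scale set `(1, 2, 4)`, and the nearest-neighbour subsampling are all assumptions made for the example; the abstract does not specify them.

```python
import numpy as np

def add_noise_channel(rgb, rng):
    """Append one channel of noise to an H x W x 3 image (Gaussian is an assumption)."""
    h, w, _ = rgb.shape
    noise = rng.standard_normal((h, w, 1)).astype(rgb.dtype)
    return np.concatenate([rgb, noise], axis=-1)

def random_scale_patch(digital, film, patch, rng, scales=(1, 2, 4)):
    """Crop the same window from both aligned images at a random scale,
    then subsample it back to patch x patch (nearest-neighbour, for brevity)."""
    s = int(rng.choice(scales))          # hypothetical scale set
    size = patch * s                     # window side length before subsampling
    h, w, _ = digital.shape
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    window = (slice(top, top + size, s), slice(left, left + size, s))
    # The identical window is applied to both images so the pair stays aligned.
    return digital[window], film[window]

# Stand-in data: random arrays in place of a real aligned digital/film pair.
rng = np.random.default_rng(0)
digital = rng.random((256, 256, 3), dtype=np.float32)
film = rng.random((256, 256, 3), dtype=np.float32)

d_patch, f_patch = random_scale_patch(digital, film, patch=32, rng=rng)
model_input = add_noise_channel(d_patch, rng)  # 4-channel network input
```

Here `model_input` would be fed to the network and `f_patch` used as the target. Cropping both images with one shared window is what keeps the pair spatially aligned at every scale.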

Paper Structure

This paper contains 16 sections, 6 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Raw Paired Image Dataset. Examples of raw image pairs from the dataset. A column shows a single scene captured with a digital camera (top) and a film camera (bottom). The images show a wide range of scenes and visual effects.
  • Figure 2: Dataset Preprocessing. Example of a raw and processed image pair. We can see the luminance alignment in the film image (a) and the spatial alignment in the digital image (b).
  • Figure 3: Training Overview. We train our model on paired digital-film images.
  • Figure 4: Single Image Select Samples. Outputs from select loss functions and configurations of noise and resizing. The best-performing model is MSE/VGG with resizing, which produces the best colour effect. LPIPS scores are also inconsistent with both SSIM and visual perception: the MSE/VGG and MSE outputs are visibly better predictions than the baseline and TV-Rel.
  • Figure 5: Single Image Noise Comparison. Outputs from select models without noise (above) and with noise (below), no resizing. We see that when we feed noise into the model, the model learns to produce some variation, especially when the loss contains a feature-based metric like VGG. However, the grain is far from the desired effect.
  • ...and 4 more figures