High-Resolution Daytime Translation Without Domain Labels
Ivan Anokhin, Pavel Solovev, Denis Korzhenkov, Alexey Kharlamov, Taras Khakhulin, Alexey Silvestrov, Sergey Nikolenko, Victor Lempitsky, Gleb Sterkin
TL;DR
HiDT tackles the problem of daytime translation for high-resolution landscape images without relying on domain labels. It introduces a content/style disentangled architecture with AdaIN-based generation and augmented skip connections, complemented by a postprocessing enhancement pipeline to produce high-resolution outputs. The method is trained on unaligned images with weak segmentation supervision and employs a suite of losses, including a CORAL-inspired style distribution loss, to learn robust style transfer. Experiments show HiDT is competitive with label-dependent baselines and generalizes to other domains, with practical applications such as timelapse generation from single images.
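As a rough illustration of the AdaIN-based generation mentioned above, the following minimal sketch shows the standard adaptive instance normalization operation: each content channel is normalized to zero mean and unit variance, then rescaled and shifted by style-derived parameters. This is not the authors' implementation. The function name `adain`, the flat-list channel layout, and the `scales`/`shifts` arguments (which in a model of this kind would presumably be predicted from the style code) are assumptions made for the example.

```python
from statistics import fmean, pstdev

def adain(channels, scales, shifts, eps=1e-5):
    """Apply adaptive instance normalization channel by channel.

    channels: list of channels, each a flat list of activations (hypothetical layout).
    scales, shifts: one style-derived scale and shift per channel (assumed inputs).
    """
    out = []
    for values, scale, shift in zip(channels, scales, shifts):
        mean = fmean(values)   # per-channel mean of the content features
        std = pstdev(values)   # per-channel (population) standard deviation
        # Normalize the content statistics away, then impose the style statistics.
        out.append([scale * (v - mean) / (std + eps) + shift for v in values])
    return out

# Toy content features: two channels with four spatial positions each.
content = [[0.0, 1.0, 2.0, 3.0], [10.0, 10.5, 11.0, 11.5]]
styled = adain(content, scales=[2.0, 0.5], shifts=[-1.0, 4.0])
```

After the call, each output channel has approximately the mean given by `shifts` and the standard deviation given by `scales`, regardless of the statistics of the input channel. This is what lets a style code control the global appearance of the output while the spatial content is preserved.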
Abstract
Modeling daytime changes in high-resolution photographs, e.g., re-rendering the same scene under different illuminations typical for day, night, or dawn, is a challenging image manipulation task. We present the high-resolution daytime translation (HiDT) model for this task. HiDT combines a generative image-to-image model and a new upsampling scheme that allows image translation to be applied at high resolution. The model demonstrates competitive results in terms of both commonly used GAN metrics and human evaluation. Importantly, this performance is achieved by training on a dataset of still landscape images with no daytime labels available. Our results are available at https://saic-mdal.github.io/HiDT/.
