Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
Levent Karacan, Zeynep Akata, Aykut Erdem, Erkut Erdem
TL;DR
This paper tackles automatic outdoor scene synthesis by conditioning image generation on semantic layouts and transient scene attributes. It introduces AL-CGAN, a conditional GAN in which the generator is conditioned on both the layout and the attributes, and a Siamese discriminator fuses image features with conditioning features, enabling precise object boundaries and diverse appearances. The model is trained on a combination of ADE20K outdoor images and the Transient Attributes dataset, and it demonstrates layout-controlled drawing of scene elements and attribute-driven appearance changes, including incremental scene editing. An ablation study shows that both layout and attribute conditioning improve realism. Future work aims to extend the model to natural language conditioning.
Abstract
Automatic image synthesis research has been growing rapidly as deep networks become increasingly expressive. In the last couple of years, we have observed images of digits, indoor scenes, birds, chairs, etc. being automatically generated. The expressive power of image generators has also been enhanced by introducing several forms of conditioning variables such as object names, sentences, bounding boxes, and key-point locations. In this work, we propose a novel deep conditional generative adversarial network architecture that takes its strength from the semantic layout and scene attributes integrated as conditioning variables. We show that our architecture is able to generate realistic outdoor scene images with clear object boundaries under different conditions, e.g., day versus night or sunny versus foggy.
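
To make the idea of "semantic layout and scene attributes integrated as conditioning variables" concrete, the following is a minimal sketch of one common way to assemble such a conditioning input: the layout is one-hot encoded per pixel and the attribute vector is replicated at every spatial location before the two are concatenated along the channel axis. This is an illustration under stated assumptions, not the paper's actual pipeline; the function name `build_conditioning`, the class ids, and the attribute names are all hypothetical.

```python
import numpy as np

def build_conditioning(layout, attributes, num_classes):
    """Stack a one-hot semantic layout with spatially replicated attributes.

    layout:      (H, W) integer array of class ids in [0, num_classes)
    attributes:  (A,) float array of transient attribute strengths
    returns:     (H, W, num_classes + A) float32 array
    """
    layout = np.asarray(layout)
    attributes = np.asarray(attributes, dtype=np.float32)
    h, w = layout.shape
    # One-hot encode each pixel's class id: (H, W, num_classes).
    one_hot = np.eye(num_classes, dtype=np.float32)[layout]
    # Repeat the attribute vector at every pixel: (H, W, A).
    attr_map = np.broadcast_to(attributes, (h, w, attributes.shape[0]))
    # Concatenate along the channel axis.
    return np.concatenate([one_hot, attr_map], axis=-1)

# Toy 2x3 layout with three hypothetical classes (0 = sky, 1 = tree, 2 = road).
layout = np.array([[0, 0, 1],
                   [2, 2, 1]])
# Two hypothetical transient attributes, e.g. "night" and "fog".
attributes = np.array([0.9, 0.1])

cond = build_conditioning(layout, attributes, num_classes=3)
print(cond.shape)
```

In a conditional GAN, a tensor like `cond` would typically be fed to the generator alongside a noise vector and to the discriminator alongside the image; changing `attributes` while keeping `layout` fixed corresponds to the attribute-driven appearance changes described above.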
