Mechanisms of Generative Image-to-Image Translation Networks
Guangzong Chen, Mingui Sun, Zhi-Hong Mao, Kangni Liu, Wenyan Jia
TL;DR
The paper explains why GAN-based image-to-image translation can match autoencoder-based approaches without additional penalty terms. It shows that the GAN and autoencoder objectives are equivalent under two conditions: the generator can reconstruct the input, and the discriminator can perfectly distinguish real from generated data. Algebraic and geometric interpretations show that, with sufficient discriminator capacity, adversarial training pushes the generated image toward the input, so the GAN effectively behaves like an autoencoder. Extending this view to image-to-image translation, the work uses two datasets to separate content (shape) from style (texture) and demonstrates translations that preserve global structure while altering texture, with explanations grounded in content-style decomposition. Empirical results on AFHQ, FFHQ, and artwork-style translation validate the approach and reveal how encoder dimensionality and dataset size influence the balance between preserving content and transferring style, offering a simpler, penalty-free alternative for certain translation tasks.
Abstract
Generative Adversarial Networks (GANs) are a class of neural networks that have been widely used in the field of image-to-image translation. In this paper, we propose a streamlined image-to-image translation network with a simpler architecture than existing models. We investigate the relationship between GANs and autoencoders and explain why the GAN component alone is effective for image translation tasks. We show that adversarial training of GAN models yields results comparable to those of existing methods without additional complex loss penalties, and we elucidate the rationale behind this phenomenon. We also present experimental results that demonstrate the validity of our findings.
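For orientation, the two objectives being related can be written in their standard textbook forms. This is a sketch, not the paper's own notation: the symbols G (generator), D (discriminator), E and Dec (encoder and decoder), the distributions p_data and p_x, and the squared-error reconstruction loss are all assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the paper):
%   x ~ p_x     : input images,  y ~ p_data : real target-domain images
%   G, D        : generator and discriminator
%   E, Dec      : encoder and decoder of an autoencoder
\begin{align}
  \min_{G}\,\max_{D}\;
    & \mathbb{E}_{y \sim p_{\mathrm{data}}}\bigl[\log D(y)\bigr]
      + \mathbb{E}_{x \sim p_{x}}\bigl[\log\bigl(1 - D(G(x))\bigr)\bigr]
    && \text{(standard adversarial objective)} \\
  \min_{E,\,\mathrm{Dec}}\;
    & \mathbb{E}_{x \sim p_{x}}\,\bigl\lVert x - \mathrm{Dec}(E(x)) \bigr\rVert_2^2
    && \text{(autoencoder reconstruction objective)}
\end{align}
```

In this notation, the relationship investigated here concerns the regime in which G is able to reconstruct its input and D separates real from generated samples perfectly; the question is why minimizing the first objective then behaves like minimizing the second.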
