Paired and Unpaired Image to Image Translation using Generative Adversarial Networks
Gaurav Kumar, Soham Satyadharma, Harpreet Singh
TL;DR
The paper tackles image-to-image translation in both paired and unpaired settings using GAN-based frameworks. It adopts Pix2Pix for paired translation and CycleGAN for unpaired translation, and systematically evaluates loss functions ($L_{cGAN}$, $L_{L1}$, $L_{L2}$) and PatchGAN configurations using precision, recall, and $\mathrm{FID}$. Key findings are that the $L_{L1}$ loss yields the strongest performance on paired tasks, that paired translation outperforms unpaired translation, and that the $70\times70$ PatchGAN outperforms larger patch sizes in some respects. The work applies one GAN-based approach to cross-domain translation across multiple datasets and reports both quantitative and qualitative analyses to inform practical deployments.
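To make the loss terms named above concrete, here is a minimal NumPy sketch of how $L_{L1}$ or $L_{L2}$ is typically combined with the adversarial term in a Pix2Pix-style generator objective. The function names, the non-saturating form of the adversarial term, and the weight `lam=100.0` are assumptions for illustration; the text here does not specify the authors' exact formulation or hyperparameters.

```python
import numpy as np

def l1_loss(fake, target):
    """Mean absolute error between generated and ground-truth images (L_L1)."""
    return float(np.mean(np.abs(fake - target)))

def l2_loss(fake, target):
    """Mean squared error between generated and ground-truth images (L_L2)."""
    return float(np.mean((fake - target) ** 2))

def generator_adv_loss(d_fake_logits):
    """Non-saturating adversarial term: mean of -log(sigmoid(logit)).

    d_fake_logits is the discriminator's output on generated images. For a
    PatchGAN this is a 2-D map with one logit per image patch.
    """
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed stably with logaddexp.
    return float(np.mean(np.logaddexp(0.0, -d_fake_logits)))

def paired_generator_objective(d_fake_logits, fake, target,
                               lam=100.0, recon=l1_loss):
    """Adversarial term plus a weighted reconstruction term.

    lam is a hypothetical weight; recon can be l1_loss or l2_loss.
    """
    return generator_adv_loss(d_fake_logits) + lam * recon(fake, target)

# Toy usage: a 4x4 "image" and a 2x2 patch map of discriminator logits.
fake = np.zeros((4, 4))
target = np.full((4, 4), 0.5)
patch_logits = np.zeros((2, 2))
total_l1 = paired_generator_objective(patch_logits, fake, target)
total_l2 = paired_generator_objective(patch_logits, fake, target, recon=l2_loss)
```

Swapping `recon` between `l1_loss` and `l2_loss` is the kind of comparison the summary refers to; the adversarial term is unchanged in both cases.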
Abstract
Image-to-image translation is an active area of research in computer vision, enabling the generation of new images with different styles, textures, or resolutions while preserving their characteristic properties. Recent architectures leverage Generative Adversarial Networks (GANs) to transform input images from one domain to another. In this work, we study both paired and unpaired image translation across multiple image domains. For the paired task, we used a conditional GAN model; for the unpaired task, we trained the model using a cycle-consistency loss. We experimented with different types of loss functions, multiple PatchGAN sizes, and model architectures. New quantitative metrics (precision, recall, and FID score) were used for analysis. In addition, a qualitative study of the results of the different experiments was conducted.
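The cycle-consistency loss mentioned for the unpaired task can be sketched as follows. This is an illustrative NumPy version of the usual CycleGAN-style definition, in which an image mapped to the other domain and back should match the original under an L1 penalty. The mappings `G`, `F_inv`, and `F_bad` are toy stand-ins for the two generators and are not the authors' networks.

```python
import numpy as np

def cycle_consistency_loss(G, F, x, y):
    """L1 cycle loss: mean|F(G(x)) - x| + mean|G(F(y)) - y|.

    G maps domain X -> Y and F maps domain Y -> X. Both are arbitrary
    callables here; in practice they would be generator networks.
    """
    forward_cycle = np.mean(np.abs(F(G(x)) - x))   # X -> Y -> X
    backward_cycle = np.mean(np.abs(G(F(y)) - y))  # Y -> X -> Y
    return float(forward_cycle + backward_cycle)

# Toy generators (hypothetical): G and F_inv are exact inverses of each other,
# while F_bad does not invert G.
G = lambda x: 2.0 * x + 1.0
F_inv = lambda y: (y - 1.0) / 2.0
F_bad = lambda y: y / 2.0

x = np.array([[0.0, 0.25], [0.5, 1.0]])
y = np.array([[1.0, 2.0], [3.0, 4.0]])

loss_consistent = cycle_consistency_loss(G, F_inv, x, y)
loss_inconsistent = cycle_consistency_loss(G, F_bad, x, y)
```

When the two mappings invert each other the loss is zero. With `F_bad`, each round trip leaves a constant offset, so the loss is positive. This is the signal that lets unpaired training proceed without ground-truth image pairs.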
