Paired and Unpaired Image to Image Translation using Generative Adversarial Networks

Gaurav Kumar, Soham Satyadharma, Harpreet Singh

TL;DR

The paper studies image-to-image translation in both paired and unpaired settings using GAN-based frameworks. It adopts Pix2Pix for paired translation and CycleGAN for unpaired translation, and systematically compares loss functions ($L_{cGAN}$, $L_{L1}$, $L_{L2}$) and PatchGAN configurations, evaluated with precision, recall, and FID. Key findings: the $L_{L1}$ loss gives the strongest performance on paired tasks, paired translation outperforms unpaired translation, and the $70\times70$ PatchGAN outperforms larger patch sizes on some measures. The work applies one approach to cross-domain translation across multiple datasets and reports both quantitative and qualitative analyses to inform practical deployments.

Abstract

Image-to-image translation is an active area of research in computer vision, enabling the generation of new images with different styles, textures, or resolutions while preserving their characteristic properties. Recent architectures leverage Generative Adversarial Networks (GANs) to transform input images from one domain to another. In this work, we study both paired and unpaired image translation across multiple image domains. For the paired task, we used a conditional GAN model; for the unpaired task, we trained the model with a cycle-consistency loss. We experimented with different types of loss functions, multiple PatchGAN sizes, and model architectures. New quantitative metrics (precision, recall, and FID score) were used for analysis. In addition, a qualitative study of the results of the different experiments was conducted.
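
For orientation, the loss terms named in the TL;DR can be written out as follows. These are the standard formulations from the original Pix2Pix and CycleGAN work, given here as an assumption; the paper's own nine equations are not reproduced in this summary and may differ in notation or weighting. Here $x$ is the input image, $y$ the target image, $z$ a noise vector, $G$ and $F$ the two generators, $D$ the discriminator, and $\lambda$ a weighting hyperparameter.

$$
\begin{aligned}
L_{cGAN}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big] + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big] \\
L_{L1}(G) &= \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big] \\
G^{*} &= \arg\min_{G}\max_{D}\; L_{cGAN}(G, D) + \lambda\, L_{L1}(G) \\
L_{cyc}(G, F) &= \mathbb{E}_{x}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y}\big[\lVert G(F(y)) - y \rVert_1\big]
\end{aligned}
$$

Under the same assumption, an $L_{L2}$ variant would replace the $\ell_1$ norm in the reconstruction term with a squared $\ell_2$ norm.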

Paper Structure

This paper contains 19 sections, 9 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: From top to bottom: input, ground truth, and generated images from paired L1 loss experiments on the facades, maps, and cityscapes datasets.
  • Figure 2: From top to bottom: input and generated images from unpaired L1 cyclic loss experiments on the facades and horse2zebra datasets.
  • Figure 3: Loss curves for the L1 loss experiment on the maps dataset with a batch size of 16 for 150 epochs.
  • Figure 4: Generation results using different methods. From left to right and top to bottom: Input, Ground Truth, L1 cyclic loss, L2 cyclic loss, Patch 16, Patch 286, Skip, and Pix2Pix L2 loss.
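
The patch sizes that appear here (Patch 16, Patch 286, and the $70\times70$ PatchGAN from the TL;DR) refer to the discriminator's receptive field: the size of the input region that influences one output score. A minimal sketch of how those numbers arise, assuming the layer layout of the original Pix2Pix discriminators (4×4 convolutions, a run of stride-2 layers followed by two stride-1 layers); the exact configurations used in this paper are not stated in this summary, and the names `receptive_field` and `PATCH_CONFIGS` are hypothetical.

```python
def receptive_field(strides, kernel_size=4):
    """Receptive field of a stack of conv layers sharing one kernel size.

    Walks the stack from the output back to the input: one output unit
    sees `kernel_size` units of the layer below, and each step back
    through a layer with stride s spreads neighbouring units s apart.
    """
    rf = 1
    for stride in reversed(strides):
        rf = (rf - 1) * stride + kernel_size
    return rf


# Assumed per-layer strides, following the original Pix2Pix PatchGAN
# variants (not confirmed for this paper).
PATCH_CONFIGS = {
    "Patch 16": [2, 1, 1],
    "Patch 70": [2, 2, 2, 1, 1],
    "Patch 286": [2, 2, 2, 2, 2, 1, 1],
}

for name, strides in PATCH_CONFIGS.items():
    size = receptive_field(strides)
    print(f"{name}: {size}x{size} receptive field")
```

Each additional stride-2 layer roughly doubles the patch the discriminator judges, so the patch size controls whether realism is penalized at the level of local texture or of near-whole-image structure.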