Image-to-Image Translation: Methods and Applications
Yingxue Pang, Jianxin Lin, Tao Qin, Zhibo Chen
TL;DR
This survey catalogs image-to-image translation (I2I) techniques, organizing methods into two-domain and multi-domain settings and distinguishing supervised, unsupervised, semi-supervised, and few-shot paradigms. It analyzes the core generative backbones (VAEs and GANs), details stabilization strategies, and enumerates objective and subjective evaluation metrics. The work inventories a broad array of models, from pix2pix and CycleGAN to SPADE, StarGANv2, and CoCosNet, across diverse applications, and discusses practical considerations such as data requirements, multimodal outputs, and domain adaptation. By mapping algorithmic advances to concrete tasks and datasets, the paper highlights both the progress made and the challenges that remain in producing high-fidelity, diverse translations at scale. The overall contribution is a consolidated reference that helps researchers and practitioners navigate I2I methods and their applications and identify gaps for future work.
Abstract
Image-to-image translation (I2I) aims to transfer images from a source domain to a target domain while preserving the content representations. I2I has drawn increasing attention and made tremendous progress in recent years because of its wide range of applications in many computer vision and image processing problems, such as image synthesis, segmentation, style transfer, restoration, and pose estimation. In this paper, we provide an overview of the I2I works developed in recent years. We analyze the key techniques of existing I2I works and clarify the main progress the community has made. Additionally, we elaborate on the effect of I2I on the research and industry communities and point out remaining challenges in related fields.
