HyPER-GAN: Hybrid Patch-Based Image-to-Image Translation for Real-Time Photorealism Enhancement

Stefanos Pasios, Nikos Nikolaidis

TL;DR

Experimental results demonstrate that HyPER-GAN outperforms state-of-the-art paired image-to-image translation methods in terms of inference latency, visual realism, and semantic robustness.

Abstract

Generative models are widely employed to enhance the photorealism of synthetic data for training computer vision algorithms. However, they often introduce visual artifacts that degrade the accuracy of these algorithms and require high computational resources, limiting their applicability in real-time training or evaluation scenarios. In this paper, we propose Hybrid Patch Enhanced Realism Generative Adversarial Network (HyPER-GAN), a lightweight image-to-image translation method based on a U-Net-style generator designed for real-time inference. The model is trained using paired synthetic and photorealism-enhanced images, complemented by a hybrid training strategy that incorporates matched patches from real-world data to improve visual realism and semantic consistency. Experimental results demonstrate that HyPER-GAN outperforms state-of-the-art paired image-to-image translation methods in terms of inference latency, visual realism, and semantic robustness. Moreover, the proposed hybrid training strategy is shown to improve visual quality and semantic consistency compared to training the model solely with paired synthetic and photorealism-enhanced images. Code and pretrained models are publicly available for download at: https://github.com/stefanos50/HyPER-GAN
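
The abstract does not say how generated patches are matched to real-world patches, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes non-overlapping square patches, a hypothetical descriptor (flattened, L2-normalised pixels), and cosine-similarity nearest-neighbour lookup against a precomputed index of real-world descriptors. The function names `extract_patches`, `describe`, and `match_patches` are invented for this example.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Split an H x W x C image into non-overlapping square patches."""
    h, w = image.shape[:2]
    patches = [
        image[y:y + patch_size, x:x + patch_size]
        for y in range(0, h - patch_size + 1, patch_size)
        for x in range(0, w - patch_size + 1, patch_size)
    ]
    return np.stack(patches)

def describe(patches):
    """Hypothetical descriptor (an assumption): flattened pixels, L2-normalised."""
    flat = patches.reshape(len(patches), -1).astype(np.float64)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    return flat / np.maximum(norms, 1e-12)

def match_patches(generated_desc, real_index):
    """Return, per generated descriptor, the index of the most similar
    real-world descriptor and the corresponding cosine similarity."""
    sims = generated_desc @ real_index.T  # rows are unit vectors, so this is cosine similarity
    best = sims.argmax(axis=1)
    return best, sims[np.arange(len(generated_desc)), best]

# Toy data standing in for a real-world dataset and a generator output.
rng = np.random.default_rng(0)
real_image = rng.random((64, 64, 3))
real_patches = extract_patches(real_image, 16)   # 4 x 4 grid of patches
real_index = describe(real_patches)              # built once, reused for every query

# Darkened copies of two real patches play the role of generated patches.
generated_patches = real_patches[[3, 7]] * 0.5
idx, sim = match_patches(describe(generated_patches), real_index)
print(idx, sim)
```

Because the descriptors are normalised, the darkened copies still retrieve their source patches. A practical system would more likely use learned features and an approximate nearest-neighbour index, but the abstract does not state which, so both remain open choices here.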

Paper Structure (16 sections, 6 equations, 7 figures, 4 tables)

This paper contains 16 sections, 6 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Overview of the HyPER-GAN method, which includes four phases: a) datasets and preprocessing, b) real-world dataset indexing, c) training, and d) photorealism enhancement.
  • Figure 2: Examples of matched patch pairs: generated patches (top) and their real-world matches (bottom).
  • Figure 3: Translation results of PFD (GTA-V) towards real-world datasets (CS and MV) produced by a) EPE and b) HyPER-GAN.
  • Figure 4: Translation results of PFD (GTA-V) towards the real-world datasets CS and MV produced by a) FastCUT, b) REGEN, and c) HyPER-GAN.
  • Figure 5: Comparison of a) an input image from the PFB dataset with the images generated by b) COSMOS Transfer1 and c) HyPER-GAN.
  • ...and 2 more figures