REGEN: Real-Time Photorealism Enhancement in Games via a Dual-Stage Generative Network Framework

Stefanos Pasios, Nikos Nikolaidis

TL;DR

REGEN tackles real-time photorealism enhancement in games by coupling two stages: a robust unpaired image-to-image (Im2Im) translation model that generates semantically aligned photorealistic counterparts of game frames offline, and a lightweight paired Im2Im model trained on those pairs for in-game inference. The framework is engine-friendly: it avoids runtime G-Buffer dependencies and integrates via ONNX Runtime for real-time deployment in Unreal Engine and Unity. Empirical results show comparable or improved perceptual quality, as measured by CMMD, with up to a 12x increase in frame rate, while maintaining semantic integrity and temporal coherence across diverse datasets and engine versions. This work offers a practical path toward plug-and-play photorealism improvements in modern game pipelines and points to future work on specialized, ultra-fast paired translators.

Abstract

Photorealism is an important aspect of modern video games since it can shape player experience and impact immersion, narrative engagement, and visual fidelity. To achieve photorealism, beyond traditional rendering pipelines, generative models have been increasingly adopted as an effective approach for bridging the gap between the visual realism of synthetic and real worlds. However, under the real-time constraints of video games, existing generative approaches continue to face a tradeoff between visual quality and runtime efficiency. In this work, we present a framework for enhancing the photorealism of rendered game frames using generative networks. We propose REGEN, which first employs a robust unpaired image-to-image translation model to generate semantically consistent photorealistic frames. These generated frames are then used to create a paired dataset, which reduces the problem to a simpler paired image-to-image translation. This enables training with a lightweight method, achieving real-time inference without compromising visual quality. We evaluate REGEN on Unreal Engine, showing, by employing the CMMD metric, that it achieves comparable or slightly improved visual quality compared to the robust method, while improving the frame rate by 12x. Additional experiments also validate that REGEN retains the semantic preservation of the initial robust image-to-image translation method and maintains temporal consistency. Code, pre-trained models, and demos for this work are available at: https://github.com/stefanos50/REGEN
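
The abstract reports visual quality with the CMMD metric, which compares two image sets through a maximum mean discrepancy (MMD) computed on image embeddings with a Gaussian kernel. The following is a minimal sketch of a squared-MMD estimate only, not the evaluation code used in this work: the toy 2-D vectors stand in for real image embeddings, and the function names, the bandwidth `sigma`, and the choice of the biased (V-statistic) estimator are assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    return (
        mean_kernel(xs, xs, sigma)
        + mean_kernel(ys, ys, sigma)
        - 2.0 * mean_kernel(xs, ys, sigma)
    )


# Toy stand-ins for embeddings of real and generated images (assumed values).
real_embeddings = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
generated_embeddings = [[1.0, 0.9], [0.8, 1.1], [1.1, 1.0]]

same_set_score = mmd_squared(real_embeddings, real_embeddings)
cross_set_score = mmd_squared(real_embeddings, generated_embeddings)
```

A lower score means the two embedding distributions are closer, so comparing a set against itself gives zero up to rounding, while the two separated toy sets give a clearly positive value.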

Paper Structure

This paper contains 17 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overview of the REGEN framework, divided into four main phases: a) collection of an unpaired set that includes video game and real-world data, b) training of a robust unpaired Im2Im translation network to produce a semantically consistent photorealism-enhanced version of the game dataset, c) training of a lightweight paired Im2Im translation method between the video game and the photorealism-enhanced datasets, and d) final integration in the game.
  • Figure 2: Visual comparison of the images generated by EPE and REGEN towards the characteristics of Cityscapes and KITTI, when given as input a CARLA2Real-UE4 frame from the test set.
  • Figure 3: Example failure cases of EPE, where unrealistic glossiness and material artifacts appear on vehicle surfaces compared to the initial render and the results produced by the proposed method.
  • Figure 4: Translation result of REGEN on a CrowdFlow frame (left) towards the characteristics of KITTI (right).
  • Figure 5: Visual comparison of the images generated by CUT, MUNIT, Color Transfer, and REGEN, when given as input a PFD frame from the test set.
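
The data flow of phases (b) to (d) in Figure 1 can be illustrated with a toy sketch. It is a sketch under stated assumptions, not the authors' implementation: frames are short lists of pixel intensities, `robust_translate` is a hypothetical stand-in for the robust unpaired Im2Im network, and the "lightweight paired model" is a two-parameter affine map fitted by least squares rather than a neural network. All function names are invented for this example.

```python
def robust_translate(frame):
    """Hypothetical stand-in for the robust unpaired translator (run offline)."""
    return [0.8 * p + 0.1 for p in frame]


def build_paired_dataset(game_frames, teacher):
    """Pair each game frame with its offline-enhanced counterpart."""
    return [(frame, teacher(frame)) for frame in game_frames]


def fit_lightweight_model(pairs):
    """Fit a toy per-pixel affine map y = a * x + b by ordinary least squares."""
    xs = [p for source, _ in pairs for p in source]
    ys = [p for _, target in pairs for p in target]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def enhance(frame, model):
    """In-game inference with the fitted lightweight model only."""
    a, b = model
    return [a * p + b for p in frame]


# Toy game frames (assumed values); the teacher runs once, offline.
game_frames = [[0.0, 0.25, 0.5], [0.5, 0.75, 1.0]]
pairs = build_paired_dataset(game_frames, robust_translate)
model = fit_lightweight_model(pairs)

# At runtime only the cheap model is evaluated on new frames.
enhanced = enhance([0.2, 0.4], model)
```

The point of the design is that the expensive translator is needed only to create the paired targets; once the pairs exist, the runtime path depends solely on the cheaper model trained to reproduce them.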