PatchEX: High-Quality Real-Time Temporal Supersampling through Patch-based Parallel Extrapolation

Akanksha Dixit, Smruti R. Sarangi

TL;DR

PatchEX aims to deliver interpolation-like visual quality at the low latency of extrapolation for real-time temporal supersampling. It partitions the extrapolation task into three spatially informed patches (foreground, near-background, and far-background) and processes them in parallel using two specialized, lightweight networks, while extrapolating shadows separately and blending the results with G-buffer-guided warping. The approach is complemented by foveated segmentation, local binary pattern (LBP) features, and perceptual losses. It is validated on a diverse Unreal Engine 5.1 dataset, where it shows PSNR gains of 65.29% over ExtraNet and 48.46% over ExtraSS, along with significant latency reductions, including sub-3 ms per-frame inference at 1080p. These results demonstrate PatchEX’s potential to deliver high-quality temporal supersampling in real-time applications such as gaming and medical visualization on consumer GPUs. The method’s explicit handling of disocclusions and shading, combined with performance that scales to high resolutions, suggests practical impact for GPU rendering pipelines seeking smoother frame delivery without sacrificing visual fidelity.

Abstract

High-refresh-rate displays have become very popular in recent years due to the need for superior visual quality in gaming, professional displays, and specialized applications like medical imaging. However, a high-refresh-rate display alone does not guarantee a superior visual experience; the GPU needs to render frames at a matching rate. Otherwise, we observe disconcerting visual artifacts such as screen tearing and stuttering. Temporal supersampling is an effective technique to increase frame rates by predicting new frames from other rendered frames. There are two classes of methods in this space: interpolation and extrapolation. Interpolation-based methods provide good image quality at the cost of higher latency because they also require the next rendered frame. Extrapolation methods, on the other hand, are much faster at the cost of quality. This paper introduces PatchEX, a novel frame extrapolation method that aims to provide the quality of interpolation at the speed of extrapolation. It partitions the extrapolation task into sub-tasks and executes them in parallel to improve both quality and latency. It then uses a patch-based inpainting method and a custom shadow prediction approach to fuse the generated sub-frames. This approach significantly reduces the overall latency while maintaining the quality of the output. Our results demonstrate that PatchEX improves PSNR by 65.29% and 48.46% over the latest extrapolation methods, ExtraNet and ExtraSS, while being 6x and 2x faster, respectively.
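Since the headline results are reported as PSNR improvements, a minimal sketch of the standard PSNR computation may help in reading those numbers. The function names `psnr` and `relative_gain_percent` are hypothetical and not taken from the paper. The percentage helper assumes the reported gains are relative changes in the dB value, which the abstract does not specify.

```python
import math


def psnr(reference, predicted, max_value=255.0):
    """Standard peak signal-to-noise ratio in dB for two equally sized
    flat sequences of pixel values (illustrative, not the paper's code)."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and predicted pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain_percent(baseline_db, new_db):
    """Relative change between two PSNR values, in percent.
    Assumption: this is how a percentage PSNR gain would be computed."""
    return 100.0 * (new_db - baseline_db) / baseline_db
```

A higher PSNR means the predicted frame is closer to the rendered ground-truth frame. Because PSNR is logarithmic, a percentage change in dB does not translate linearly into a change in pixel error.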

Paper Structure

This paper contains 41 sections, 8 equations, 21 figures, 12 tables, 1 algorithm.

Figures (21)

  • Figure 1: The solution space for temporal supersampling. Each solution is run on an NVIDIA RTX 4090 GPU. The detailed system configuration is shown in Table \ref{tab:config}.
  • Figure 2: Interpolation and extrapolation explained. $F_i$ is the rendered frame. $R$ and $D$ represent the rendering time and refresh latency, respectively.
  • Figure 3: Example views from a few sample scenes
  • Figure 4: Variation in the total rendering time
  • Figure 5: Top 10 high-latency steps in the rendering process
  • ...and 16 more figures