Beyond Feature Mapping GAP: Integrating Real HDRTV Priors for Superior SDRTV-to-HDRTV Conversion

Gang He, Kepeng Xu, Li Xu, Siqi Wang, Wenxin Yu, Xianyun Wu

TL;DR

The paper tackles the ill-posed problem of converting SDRTV to HDRTV in real-world content. It introduces a prior-guided framework, RealHDRTVNet, that leverages real HDRTV priors through HDRTV-VQGAN to constrain the solution space and guide SDRTV-to-HDRTV mapping. The method comprises three phases: HDRTV-VQGAN to learn priors, an SDRTV Modulation Encoder to align SDRTV features with those priors, and RealHDRTVNet with HDR Color Alignment (HCA) and SDR Texture Alignment (STA) to produce high-quality HDR outputs. Across synthetic and real datasets, the approach yields improvements in objective metrics (PSNR, SSIM, $\Delta E_{ITP}$, HDRVDP3) and perceptual quality measures (LPHPS, NHQE, FHAD), demonstrating enhanced generalization and practical impact for HDRTV restoration.

Abstract

The rise of HDR-WCG display devices has highlighted the need to convert SDRTV to HDRTV, as most video sources are still in SDR. Existing methods primarily focus on designing neural networks to learn a single-style mapping from SDRTV to HDRTV. However, the limited information in SDRTV and the diversity of styles in real-world conversions render this process an ill-posed problem, thereby constraining the performance and generalization of these methods. Inspired by generative approaches, we propose a novel method for SDRTV to HDRTV conversion guided by real HDRTV priors. Despite the limited information in SDRTV, introducing real HDRTV as reference priors significantly constrains the solution space of the originally high-dimensional ill-posed problem. This shift transforms the task from solving an unreferenced prediction problem to making a referenced selection, thereby markedly enhancing the accuracy and reliability of the conversion process. Specifically, our approach comprises two stages: the first stage employs a Vector Quantized Generative Adversarial Network to capture HDRTV priors, while the second stage matches these priors to the input SDRTV content to recover realistic HDRTV outputs. We evaluate our method on public datasets, demonstrating its effectiveness with significant improvements in both objective and subjective metrics across real and synthetic datasets.
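
The "referenced selection" idea in the abstract can be illustrated with the nearest-neighbour codebook lookup that vector-quantized models generally rely on. The following is a minimal sketch, not the authors' implementation. The function names, the 2-D toy codebook, and the feature values are all hypothetical, and a real HDRTV-VQGAN would learn a much larger, higher-dimensional codebook during training.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: List[Vector], codebook: List[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Replace each feature with its nearest codebook entry.

    Returns the selected indices and the corresponding codebook vectors.
    """
    indices: List[int] = []
    quantized: List[Vector] = []
    for feature in features:
        # "Referenced selection": pick an existing prior entry
        # instead of predicting an unconstrained value.
        idx = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


# Toy codebook standing in for learned HDRTV prior entries (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
# Toy features standing in for encoded SDRTV content (hypothetical values).
features = [(0.1, -0.2), (0.9, 1.2), (0.2, 1.8)]

indices, quantized = quantize(features, codebook)
print(indices)  # [0, 1, 2]
```

Because every output vector is drawn from the fixed codebook, the solution space is limited to combinations of stored entries, which is the sense in which a prior constrains an ill-posed mapping.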

Paper Structure

This paper contains 27 sections, 10 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: (a) Previous methods learn a single-style SDRTV-to-HDRTV conversion on a single dataset. However, the distribution of SDRTV-to-HDRTV conversions in real-world scenarios is complex and diverse, which makes it difficult for previous methods to convert real-world content effectively. (b) Our method embeds rich and realistic HDRTV priors into the conversion network, thereby greatly improving conversion performance in real scenes. (c) The latent variable distribution of our method is closer to the GT due to the incorporation of real HDRTV prior guidance.
  • Figure 2: RealHDRTVNet framework. (a) HDRTV-VQGAN. We first pre-train an HDRTV-VQGAN to learn and store HDRTV priors through self-reconstruction. (b) RealHDRTVNet. The SDRTV modulation encoder $E_{sfm}$ produces "nearly high-quality HDRTV features". Next, the HDR Color Alignment (HCA) module aligns the input features with HDRTV in the color dynamic range dimension. In addition, the SDR Texture Alignment (STA) module aligns the texture with the input SDRTV. As a result, the dynamic range information of the converted output is consistent with HDRTV, and its texture details are consistent with the SDRTV input.
  • Figure 3: Qualitative results on synthetic datasets. Our RealHDRTVNet can recover realistic HDRTV color information through embedded real-world HDRTV priors. (Zoom in for details)
  • Figure 4: Visual ablation. The HDR Color Alignment (HCA) module and the SDR Texture Alignment (STA) module are added.