GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction

Hrishav Bakul Barua, Kalin Stefanov, KokSheik Wong, Abhinav Dhall, Ganesh Krishnasamy

TL;DR

HDR reconstruction from LDR remains challenging due to limited diverse ground-truth datasets. GTA-HDR provides a large-scale synthetic HDR dataset sourced from GTA-V, featuring 40K ground-truth HDR images, multi-resolution HDR/LDR pairs, 1M LDR variants via exposure and contrast manipulations, and 40K distorted HDR samples, along with a data collection pipeline and evaluation code. Thorough experiments show GTA-HDR improves state-of-the-art HDR reconstruction methods and enhances generalization, while also boosting performance in downstream tasks such as 3D human pose estimation and semantic segmentation when used as pre-processing. The dataset enables no-reference HDR quality assessment development and broadens the evaluative and application landscape for HDR imaging in computer vision.

Abstract

High Dynamic Range (HDR) content (i.e., images and videos) has a broad range of applications. However, capturing HDR content from real-world scenes is expensive and time-consuming. Therefore, the challenging task of reconstructing visually accurate HDR images from their Low Dynamic Range (LDR) counterparts is gaining attention in the vision research community. A major challenge in this research problem is the lack of datasets that capture diverse scene conditions (e.g., lighting, shadows, weather, locations, landscapes, objects, humans, buildings) and various image features (e.g., color, contrast, saturation, hue, luminance, brightness, radiance). To address this gap, we introduce GTA-HDR, a large-scale synthetic dataset of photo-realistic HDR images sampled from the GTA-V video game. We perform a thorough evaluation of the proposed dataset, which demonstrates significant qualitative and quantitative improvements in state-of-the-art HDR image reconstruction methods. Furthermore, we demonstrate the effectiveness of the proposed dataset and its impact on additional computer vision tasks, including 3D human pose estimation, human body part segmentation, and holistic scene segmentation. The dataset, data collection pipeline, and evaluation code are available at: https://github.com/HrishavBakulBarua/GTA-HDR.

Paper Structure (32 sections, 6 figures, 5 tables)

Figures (6)

  • Figure 1: GTA-HDR dataset collection pipeline. See the dataset collection section for a detailed description of the three steps; GT: Ground truth; Dis: Distorted; EV: Exposure value; CL: Contrast level. Note: The GTA-V logo is retrieved from Google Images.
  • Figure 2: GTA-HDR dataset scene diversity. Samples from the GTA-HDR dataset with multiple variations in location, weather, objects and time. The scene diversity ensures a thorough coverage of pixel colors, brightness, and luminance.
  • Figure 3: GTA-HDR dataset image diversity. Samples from the GTA-HDR dataset with multiple exposure values, contrast levels, and their combinations. For any image-to-image translation dataset, it is important to include sufficient samples with a diverse range of color hues, saturation, exposure, and contrast levels.
  • Figure 4: HDR images reconstructed with and without GTA-HDR as part of the training dataset, along with the RGB histograms and KL-divergence values. Base: HDR images reconstructed with ArtHDR-Net [barua2023arthdr] trained without GTA-HDR data; Ours: HDR images reconstructed with ArtHDR-Net [barua2023arthdr] trained with GTA-HDR data; GT: Ground truth.
  • Figure 5: Feature space covered by different conventional HDR image reconstruction datasets. We used the UMAP [mcinnes2018umap] dimensionality reduction technique to visualize the features extracted from the most common pre-trained feature extraction backbones. Real Datasets: Real data combining the datasets proposed in [kalantari2017deep, prabhakar2019fast, jang2020dynamic]; City Scene: Mixed datasets proposed in [zhang2017learning, hdrsky]; HDR-Synth & HDR-Real: Mixed dataset proposed in [liu2020single]; GTA-HDR: Proposed synthetic dataset.
  • ...and 1 more figures
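
The captions above mention LDR variants produced with different exposure values (EV) and contrast levels (CL), and RGB-histogram comparisons scored with KL-divergence. The following is a minimal sketch of both ideas in Python with NumPy, not the authors' pipeline. The helper names `make_ldr_variant` and `histogram_kl`, the gamma value of 2.2, the contrast stretch around mid-gray, and the 256-bin histograms are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def make_ldr_variant(hdr, ev=0.0, contrast=1.0, gamma=2.2):
    """Hypothetical helper: derive an 8-bit LDR image from linear HDR radiance.

    Assumptions (not from the paper): exposure is a 2**ev scale factor,
    saturation is a hard clip to [0, 1], and contrast is a linear stretch
    around mid-gray applied after gamma encoding.
    """
    exposed = hdr.astype(np.float64) * (2.0 ** ev)   # +1 EV doubles the captured light
    clipped = np.clip(exposed, 0.0, 1.0)             # over-exposed pixels saturate
    encoded = clipped ** (1.0 / gamma)               # simple display gamma
    adjusted = np.clip((encoded - 0.5) * contrast + 0.5, 0.0, 1.0)
    return np.round(adjusted * 255.0).astype(np.uint8)


def histogram_kl(img_p, img_q, bins=256, eps=1e-10):
    """KL(P || Q) between per-channel 8-bit histograms, averaged over channels."""
    scores = []
    for c in range(img_p.shape[-1]):
        p, _ = np.histogram(img_p[..., c], bins=bins, range=(0, 256))
        q, _ = np.histogram(img_q[..., c], bins=bins, range=(0, 256))
        # Smooth with eps so empty bins do not produce log(0) or division by zero.
        p = (p + eps) / (p + eps).sum()
        q = (q + eps) / (q + eps).sum()
        scores.append(float(np.sum(p * np.log(p / q))))
    return float(np.mean(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hdr = rng.uniform(0.0, 4.0, size=(32, 32, 3))    # synthetic stand-in for an HDR frame
    under = make_ldr_variant(hdr, ev=-2.0)
    over = make_ldr_variant(hdr, ev=2.0, contrast=1.2)
    print("KL(under || over):", histogram_kl(under, over))
```

A lower KL value means the two color distributions are closer. In this sketch the histograms are computed on 8-bit images, so comparing HDR outputs this way would first require tone mapping or a different binning scheme.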