HDR-NSFF: High Dynamic Range Neural Scene Flow Fields

Shin Dong-Yeon, Kim Jun-Seong, Kwon Byung-Ki, Tae-Hyun Oh

TL;DR

HDR-NSFF replaces 2D-based merging of alternating-exposure frames with 4D spatio-temporal modeling of dynamic HDR scenes. The paper also presents HDR-GoPro, the first real-world dataset specifically designed for dynamic HDR scenes.

Abstract

Radiance of real-world scenes typically spans a much wider dynamic range than what standard cameras can capture. While conventional HDR methods merge alternating-exposure frames, these approaches are inherently constrained to 2D pixel-level alignment, often leading to ghosting artifacts and temporal inconsistency in dynamic scenes. To address these limitations, we present HDR-NSFF, a paradigm shift from 2D-based merging to 4D spatio-temporal modeling. Our framework reconstructs dynamic HDR radiance fields from alternating-exposure monocular videos by representing the scene as a continuous function of space and time, and is compatible with both neural radiance field and 4D Gaussian Splatting (4DGS) based dynamic representations. This unified end-to-end pipeline explicitly models HDR radiance, 3D scene flow, geometry, and tone-mapping, ensuring physical plausibility and global coherence. We further enhance robustness by (i) extending semantic-based optical flow with DINO features to achieve exposure-invariant motion estimation, and (ii) incorporating a generative prior as a regularizer to compensate for limited observation in monocular captures and saturation-induced information loss. To evaluate HDR space-time view synthesis, we present the first real-world HDR-GoPro dataset specifically designed for dynamic HDR scenes. Experiments demonstrate that HDR-NSFF recovers fine radiance details and coherent dynamics even under challenging exposure variations, thereby achieving state-of-the-art performance in novel space-time view synthesis. Project page: https://shin-dong-yeon.github.io/HDR-NSFF/
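The link between HDR radiance, tone-mapping, and the captured alternating-exposure frames can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the function names `tone_map` and `photometric_loss`, the fixed gamma curve standing in for a learned camera-response function, the hand-picked white-balance gains, and the sample radiance values.

```python
def tone_map(radiance, exposure_time, white_balance=(1.0, 1.0, 1.0), gamma=2.2):
    """Map linear HDR radiance (RGB) to an LDR pixel in [0, 1].

    Assumed stand-in for a learned module: per-channel white-balance gain,
    exposure scaling, clipping at sensor saturation, then a gamma curve
    playing the role of the camera-response function (CRF).
    """
    ldr = []
    for value, gain in zip(radiance, white_balance):
        exposed = value * gain * exposure_time  # light collected by the sensor
        clipped = min(max(exposed, 0.0), 1.0)   # saturation and underexposure
        ldr.append(clipped ** (1.0 / gamma))    # hypothetical gamma CRF
    return tuple(ldr)


def photometric_loss(rendered_ldr, observed_ldr):
    """Mean squared error between a tone-mapped render and a captured LDR pixel."""
    squared = [(r - o) ** 2 for r, o in zip(rendered_ldr, observed_ldr)]
    return sum(squared) / len(squared)


# One HDR radiance value observed under a short and a long exposure.
radiance = (0.5, 2.0, 6.0)
short_exposure = tone_map(radiance, exposure_time=0.125)
long_exposure = tone_map(radiance, exposure_time=1.0)
```

In this toy setting the long exposure saturates the two brighter channels while the short exposure keeps them below the clipping point, which is why a single radiance value has to explain frames captured at different exposures.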

Paper Structure

This paper contains 52 sections, 32 equations, 28 figures, and 11 tables.

Figures (28)

  • Figure 1: High Dynamic Range Neural Scene Flow Fields (HDR-NSFF) reconstructs a dynamic HDR radiance field from (a) alternating-exposure monocular videos. Our method enables the rendering of (b) HDR novel views across both spatial and temporal domains. Additionally, we can generate (c) novel LDR views along with their corresponding depth maps.
  • Figure 2: Comparison of HDR video reconstruction on training views. Given an alternating-exposure video, the HDR video reconstruction baselines, i.e., LAN-HDR [chung2023lanhdr], HDRFlow [xu2024hdrflow], and NECHDR [cui2024exposure], fail to produce consistent results, while our model ensures temporal coherence and recovers valid information in saturated regions.
  • Figure 3: Overall pipeline of our proposed method. HDR-NSFF takes an alternating-exposure monocular video as input and estimates 3D scene flow for the sampled points along each ray. Neighboring frames are then warped to render the HDR radiance at the target frame, which is tone-mapped to LDR via a white-balance and camera-response function module. A photometric loss against the ground-truth LDR images, together with optical flow and depth constraints from off-the-shelf models, jointly optimizes both the scene flow fields and the tone-mapping module in an end-to-end manner.
  • Figure 4: Visualization of flow estimation under varying exposure conditions. RAFT yields noticeable errors when exposures vary. Even with gamma correction, RAFT (Gamma) still fails to produce accurate results. While RAFT fine-tuned on synthetic data, RAFT (Finetuned), shows moderate improvement, our proposed semantic-based approach achieves superior accuracy.
  • Figure 5: Generative prior (GP) pipeline for HDR dynamic radiance fields optimization. Unseen novel views are first rendered and then refined via a GP to restore details in regions with broken correspondences. These enhanced views serve as pseudo-labels for the progressive optimization of HDR dynamic radiance fields. This bootstrapping mechanism stabilizes correspondences, mitigating issues related to sparse viewpoints and saturated pixels in monocular videos with alternating exposures.
  • ...and 23 more figures
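
The warping step described in the Figure 3 caption can be sketched on a toy scene: sample points along a ray at the target time, displace them by a scene flow, query the field at the neighboring time, and composite with standard volume rendering. The sketch is illustrative only and is not the authors' code. The constant flow in `scene_flow`, the moving bright ball in `field`, and all names and values are hypothetical.

```python
import math


def scene_flow(point, t):
    """Hypothetical flow field: every point drifts +0.1 along x per frame."""
    return (0.1, 0.0, 0.0)


def field(point, t):
    """Hypothetical HDR field: a bright ball of radius 0.5 moving along x."""
    center = (0.1 * t, 0.0, 1.5)
    if math.dist(point, center) <= 0.5:
        return (4.0, 2.0, 1.0), 5.0  # HDR radiance above 1.0, density
    return (0.0, 0.0, 0.0), 0.0


def composite(samples, delta):
    """Standard volume rendering: accumulate transmittance-weighted radiance."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for rgb, sigma in samples:
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return tuple(color)


def render(origin, direction, t, warp_to=None, num_samples=64, far=3.0):
    """Render a ray at time t, optionally through the field at time warp_to."""
    delta = far / num_samples
    samples = []
    for i in range(num_samples):
        depth = (i + 0.5) * delta
        point = tuple(o + depth * d for o, d in zip(origin, direction))
        query_t = t
        if warp_to is not None:
            # Displace the sample by the scene flow, then query the neighbor.
            flow = scene_flow(point, t)
            steps = warp_to - t
            point = tuple(p + steps * f for p, f in zip(point, flow))
            query_t = warp_to
        samples.append(field(point, query_t))
    return composite(samples, delta)


origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
direct = render(origin, direction, t=0)
warped = render(origin, direction, t=0, warp_to=1)
```

Because the toy flow matches the ball's motion exactly, the ray rendered through the neighboring frame agrees with the direct render at the target time. With an imperfect flow the two would differ, and that difference is the kind of signal a photometric loss can use to drive the flow toward the true motion.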