SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Cheng-De Fan, Chen-Wei Chang, Yi-Ruei Liu, Jie-Ying Lee, Jiun-Long Huang, Yu-Chee Tseng, Yu-Lun Liu

TL;DR

SpectroMotion reconstructs and renders dynamic scenes with strong specular reflections by combining 3D Gaussian Splatting with physically based rendering and deformation fields. It introduces a residual correction for surface normals during deformation, a deformable environment map for time-varying lighting, and a coarse-to-fine training strategy that improves scene geometry and per-Gaussian specular color prediction. The authors report state-of-the-art view synthesis on real-world dynamic specular scenes, outperforming prior NeRF- and 3DGS-based approaches on real datasets, with real-time rendering for moderately complex scenes. The result is more faithful novel-view synthesis for scenes that combine motion, specular surfaces, and changing lighting.

Abstract

We present SpectroMotion, a novel approach that combines 3D Gaussian Splatting (3DGS) with physically-based rendering (PBR) and deformation fields to reconstruct dynamic specular scenes. Previous methods extending 3DGS to model dynamic scenes have struggled to represent specular surfaces accurately. Our method addresses this limitation by introducing a residual correction technique for accurate surface normal computation during deformation, complemented by a deformable environment map that adapts to time-varying lighting conditions. We implement a coarse-to-fine training strategy that significantly enhances scene geometry and specular color prediction. It is the only existing 3DGS method capable of synthesizing photorealistic real-world dynamic specular scenes, outperforming state-of-the-art methods in rendering complex, dynamic, and specular scenes.

Paper Structure

This paper contains 29 sections, 11 equations, 17 figures, and 8 tables.

Figures (17)

  • Figure 1: Our method, SpectroMotion, recovers and renders dynamic scenes with higher-quality reflections compared to prior work. It introduces physical normal estimation, deformable environment maps, and a coarse-to-fine training strategy to achieve superior results in rendering dynamic scenes with reflections. Here, we present a rendered test image, corresponding normal maps, and a ground-truth image, where the ground-truth normal map (used as a reference) is generated using a pre-trained normal estimator (Omnidata; Eftekhar et al., 2021). For Deformable 3DGS, we use the shortest axes of the deformed 3D Gaussians as the normals. We have highlighted the specular regions to demonstrate the effectiveness of our approach.
  • Figure 2: Method Overview. Our method stabilizes the scene geometry through three stages. In the static stage, we stabilize the geometry of the static scene by minimizing the photometric loss $\mathcal{L}_{\text{color}}$ between vanilla 3DGS renders and ground-truth images. The dynamic stage combines canonical 3D Gaussians $\mathbf{G}$ with a deformable Gaussian MLP to model dynamic scenes while simultaneously minimizing the normal loss $\mathcal{L}_{\text{normal}}$ between the rendered normal map $\mathbf{N}^t$ and the normal map computed from the gradient of the depth map $\mathbf{D}^t$, thus further enhancing the overall scene geometry. Finally, the specular stage introduces a deformable reflection MLP to handle changing environment lighting, deforming reflection directions $\omega^t_r$ to query a canonical environment map for the specular color $\mathbf{c}_s^t$. This is then combined with the diffuse color $\mathbf{c}_d$ (using zero-order spherical harmonics) and a learnable specular tint $\mathbf{s}_{\text{tint}}$ per 3D Gaussian to obtain the final color $\mathbf{c}_{\text{final}}^t$. This approach enables the modeling of dynamic specular scenes and high-quality novel view rendering.
  • Figure 3: Normal estimation. (a) shows that flatter 3D Gaussians align better with scene surfaces, their shortest axis closely matching the surface normal. In contrast, less flat 3D Gaussians fit less accurately, with their shortest axis diverging from the surface normal. (b) shows that when the deformed 3D Gaussian becomes flatter ($t=t_1$), normal residual $\Delta\mathbf{n}$ is rotated by $\mathbf{R}^t_1$ and scaled down by $\frac{\beta}{\beta^t_1}$, as flatter Gaussians require smaller normal residuals. Conversely, when the deformation results in a less flat shape ($t=t_2$), $\Delta\mathbf{n}$ is rotated by $\mathbf{R}^t_2$ and amplified by $\frac{\beta}{\beta^t_2}$, requiring a larger correction to align the shortest axis with the surface normal. (c) shows how $\gamma^k$ changes with $w$ (where $w = \frac{|\mathbf{v}_s^t|}{|\mathbf{v}_l^t|}$) for $k=1$, $k=5$, and $k=50$. Larger $w$ indicates less flat Gaussians, while smaller $w$ represents flatter Gaussians. As $k$ increases, $\gamma^k$ decreases more steeply as $w$ rises. For $k=5$, we observe a balanced behavior: $\gamma^k$ approaches 1 for low $w$ and 0 for high $w$, providing a nuanced penalty adjustment across different Gaussian shapes.
  • Figure 4: Qualitative comparison on the NeRF-DS (Yan et al., 2023) dataset.
  • Figure 5: Qualitative comparison on the HyperNeRF (Park et al., 2021) dataset.
  • ...and 12 more figures
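
The captions for Figures 2 and 3 describe a per-Gaussian chain: correct the shortest axis with a rotated, rescaled normal residual, reflect the view direction about that normal, query an environment map, and combine the result with the diffuse color and specular tint. The sketch below illustrates that chain in plain Python and is not the authors' implementation. Several pieces are assumptions: the residual is added to the shortest axis before renormalizing, $\beta$ and $\beta^t$ are treated as opaque positive scalars, the final color is taken as $\mathbf{c}_d + \mathbf{s}_{\text{tint}} \odot \mathbf{c}_s^t$ (the caption only says the terms are "combined"), and `toy_env_map` is a hypothetical stand-in for the learned canonical environment map. The deformable reflection MLP is omitted.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(x / length for x in v)


def mat_vec(R, v):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-vector."""
    return tuple(dot(row, v) for row in R)


def corrected_normal(v_s_t, delta_n, R_t, beta, beta_t):
    """Shortest axis plus rotated residual scaled by beta / beta_t.

    Assumption: the residual is added to the shortest axis v_s_t and the
    sum is renormalized; beta and beta_t are opaque positive scalars.
    """
    scale = beta / beta_t
    rotated = mat_vec(R_t, delta_n)
    return normalize(tuple(a + scale * r for a, r in zip(v_s_t, rotated)))


def reflect(omega_o, n):
    """Mirror reflection of the unit view direction omega_o (pointing away
    from the surface) about the unit normal n: 2 (omega_o . n) n - omega_o."""
    d = dot(omega_o, n)
    return tuple(2.0 * d * ni - wi for ni, wi in zip(n, omega_o))


def final_color(c_d, s_tint, c_s):
    """Assumed composition: diffuse + tint * specular, per channel."""
    return tuple(d + t * s for d, t, s in zip(c_d, s_tint, c_s))


def toy_env_map(direction):
    """Hypothetical environment map: gray light that is brighter toward +z."""
    up = max(0.0, direction[2])
    return (up, up, up)


# Example values chosen only for illustration.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
n_t = corrected_normal((0.0, 0.0, 1.0), (0.1, 0.0, 0.0), identity, 1.0, 2.0)
omega_o = normalize((1.0, 0.0, 1.0))
omega_r = reflect(omega_o, n_t)
color = final_color((0.2, 0.1, 0.1), (0.5, 0.5, 0.5), toy_env_map(omega_r))
print(n_t, omega_r, color)
```

With `beta_t` larger than `beta`, the residual is scaled down, which mirrors the caption's description of a deformed Gaussian that has become flatter and therefore needs a smaller correction.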