Dynamic Scene Reconstruction: Recent Advance in Real-time Rendering and Streaming

Jiaxuan Zhu, Hao Tang

TL;DR

This survey provides a comprehensive overview of dynamic scene reconstruction and rendering from 2D imagery, emphasizing Neural Radiance Field (NeRF)–based and 3D Gaussian Splatting (3D-GS) approaches. It categorizes dynamic NeRFs into time-augmented, deformation-based, and hybrid representations, and reviews dynamic 3D-GS through deformation-field, 4D-primitive, and per-frame training strategies, highlighting efficiency and quality trade-offs. The article also covers volumetric video representations and streaming, detailing compression, rate-distortion optimization, and streaming pipelines, supported by extensive dataset and benchmark comparisons. By synthesizing 170+ papers, it identifies key challenges—data sparsity, temporal coherence, and scalability—and outlines practical future directions for real-time, wide-scale dynamic scene capture and transmission.

Abstract

Representing and rendering dynamic scenes from 2D images is a fundamental yet challenging problem in computer vision and graphics. This survey provides a comprehensive review of the evolution of and advances in dynamic scene representation and rendering, with particular emphasis on recent progress in reconstruction methods based on Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3D-GS). We systematically summarize existing approaches, categorize them according to their core principles, compile relevant datasets, compare the performance of various methods on these benchmarks, and explore the challenges and future research directions in this rapidly evolving field. In total, we review over 170 relevant papers, offering a broad perspective on the state of the art in this domain.

Paper Structure

This paper contains 26 sections, 11 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: The structure of the survey.
  • Figure 2: General timeline of dynamic scene reconstruction methods. The bottom images are adapted from CompleteMultiviewReconstruction, suwajanakornTotalMovingFace2014, russellVideoPopupMonocular2014, newcombeDynamicFusionReconstructionTracking2015, lombardiNeuralVolumesLearning2019b, parkNerfiesDeformableNeural2021 and sun3DGStreamFlyTraining.
  • Figure 3: General pipeline of dynamic NeRF methods. The figure is adapted from mildenhall2020nerfrepresentingscenesneural, pumarola2020d and Park_2021_ICCV.
  • Figure 4: Pipeline of HexPlane. The image is courtesy of caoHexPlaneFastRepresentation2023.
  • Figure 5: General pipeline of dynamic 3D-GS methods. In deformation-based methods, an MLP deformation network is inserted into the 3D-GS pipeline to predict how the 3D Gaussians deform across frames. 4D-primitive-based methods extend the 3D case with a time dimension: 4D Gaussians are sampled into 3D Gaussians at given timestamps, and regularization terms are added to enforce temporal consistency. The figure is adapted from kerbl20233d and yangDeformable3DGaussians2023.
  • ...and 1 more figure
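
The Figure 5 caption mentions sampling 4D Gaussians into 3D Gaussians at a given timestamp. One common way to read this step is as standard multivariate Gaussian conditioning on the time coordinate. The sketch below illustrates that reading only; it is not the implementation of any surveyed method, which may parameterize the 4D covariance differently (for example through 4D rotations and scales). The function name `slice_4d_gaussian` and the (x, y, z, t) ordering are assumptions made for illustration.

```python
import math


def slice_4d_gaussian(mean4, cov4, t):
    """Condition a 4D Gaussian over (x, y, z, t) on a timestamp t.

    Hypothetical helper: returns (mean3, cov3, weight), where mean3 and cov3
    describe the 3D Gaussian at time t and weight is the unnormalized
    temporal falloff exp(-0.5 * (t - mu_t)^2 / var_t).
    """
    mu_t = mean4[3]
    var_t = cov4[3][3]
    if var_t <= 0.0:
        raise ValueError("temporal variance must be positive")

    # Cross-covariance between each spatial axis and time.
    cross = [cov4[i][3] for i in range(3)]
    dt = t - mu_t

    # Conditional mean: mu_xyz + Sigma_{xyz,t} * (t - mu_t) / Sigma_tt
    mean3 = [mean4[i] + cross[i] * dt / var_t for i in range(3)]

    # Conditional covariance: Sigma_xyz - Sigma_{xyz,t} Sigma_{t,xyz} / Sigma_tt
    cov3 = [
        [cov4[i][j] - cross[i] * cross[j] / var_t for j in range(3)]
        for i in range(3)
    ]

    # How strongly this primitive contributes at time t.
    weight = math.exp(-0.5 * dt * dt / var_t)
    return mean3, cov3, weight


# Example: a Gaussian centred at t = 0.5 whose x position is correlated with time.
example_mean = [0.0, 0.0, 0.0, 0.5]
example_cov = [
    [1.0, 0.0, 0.0, 0.25],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.25, 0.0, 0.0, 0.25],
]
mean3, cov3, weight = slice_4d_gaussian(example_mean, example_cov, t=1.0)
```

In this example the positive x-t covariance shifts the sliced Gaussian along x as t moves past the temporal mean, and the weight decays with temporal distance, so a primitive fades out at timestamps far from its own. The resulting 3D Gaussians could then be passed to an ordinary 3D-GS rasterizer.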