Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction
Edgar Sucar, Zihang Lai, Eldar Insafutdinov, Andrea Vedaldi
TL;DR
Dynamic Point Maps (DPM) extend DUSt3R's point maps by adding time as a second reference dimension, making the representation invariant to both viewpoint and scene motion. For each image, the network predicts two point maps, one per timestamp, all expressed in the first camera's frame, so that 4D quantities such as scene flow, motion segmentation, and object tracking follow directly from the outputs of a single network. Trained on a mixture of synthetic and real data drawn from seven datasets, DPM achieves state-of-the-art or competitive performance in depth prediction, dynamic reconstruction, and scene/object flow while keeping a compact, end-to-end architecture. The work lays the groundwork for dynamic 3D foundation models by providing a unified, scalable representation that handles both spatial and temporal variation and simplifies downstream 4D reasoning.
Abstract
DUSt3R has recently shown that one can reduce many tasks in multi-view geometry, including estimating camera intrinsics and extrinsics, reconstructing the scene in 3D, and establishing image correspondences, to the prediction of a pair of viewpoint-invariant point maps, i.e., pixel-aligned point clouds defined in a common reference frame. This formulation is elegant and powerful, but it cannot handle dynamic scenes. To address this challenge, we introduce the concept of Dynamic Point Maps (DPM), extending standard point maps to support 4D tasks such as motion segmentation, scene flow estimation, 3D object tracking, and 2D correspondence. Our key intuition is that, when time is introduced, there are several possible spatial and time references that can be used to define the point maps. We identify a minimal subset of such combinations that can be regressed by a network to solve the subtasks mentioned above. We train a DPM predictor on a mixture of synthetic and real data and evaluate it across diverse benchmarks for video depth prediction, dynamic point cloud reconstruction, 3D scene flow and object pose tracking, achieving state-of-the-art performance. Code, models and additional results are available at https://www.robots.ox.ac.uk/~vgg/research/dynamic-point-maps/.
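
To illustrate why point maps that share a spatial reference but differ in time reference make 4D tasks simple, the following is a minimal sketch and not the authors' implementation. It assumes that two point maps for the same image are available as arrays of shape (H, W, 3), both expressed in the first camera's frame, one at each timestamp. The function names `scene_flow` and `motion_mask` and the threshold value are hypothetical, chosen only for this example.

```python
import numpy as np


def scene_flow(points_t1: np.ndarray, points_t2: np.ndarray) -> np.ndarray:
    """Per-pixel 3D displacement between two point maps of the same image.

    Both inputs have shape (H, W, 3) and are assumed to live in the same
    spatial reference frame, so their difference reflects scene motion only.
    """
    if points_t1.shape != points_t2.shape or points_t1.shape[-1] != 3:
        raise ValueError("expected two point maps of identical shape (H, W, 3)")
    return points_t2 - points_t1


def motion_mask(flow: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """Mark pixels whose 3D displacement exceeds a threshold.

    The threshold (in scene units) is an illustrative assumption.
    """
    return np.linalg.norm(flow, axis=-1) > threshold


# Toy 2x2 example: a flat surface 2 units in front of the camera.
points_t1 = np.zeros((2, 2, 3))
points_t1[..., 2] = 2.0

# Between the two timestamps, only the pixel at row 0, column 1 moves along x.
points_t2 = points_t1.copy()
points_t2[0, 1] += np.array([0.3, 0.0, 0.0])

flow = scene_flow(points_t1, points_t2)
mask = motion_mask(flow)
```

In this toy case `mask` is true only at the pixel that moved. Because both maps are pixel-aligned to the same image and share one reference frame, no explicit camera pose or correspondence search is needed to obtain the displacement.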
