
Gaussian Splatting Lucas-Kanade

Liuyue Xie, Joel Julin, Koichiro Niinuma, Laszlo A. Jeni

TL;DR

This work tackles dynamic 3D reconstruction from monocular video with limited camera motion by replacing data-driven warp priors with an analytical scene-flow regularization for Gaussian Splatting. It derives an instantaneous velocity field from the forward warp field via $v(\mathcal{G}; t) = J^{+}_{\mathcal{G}} \frac{\mathrm{d} \boldsymbol{W}_{c\rightarrow}(\mathcal{G}; t)}{\mathrm{d} t}$ and integrates it in time (using a Runge–Kutta solver) to obtain coherent Gaussian trajectories, enforcing motion and depth consistency through a loss $\mathcal{L}_{\text{motion}} = \mathcal{L}_{\text{flow}} + \mathcal{L}_{\text{depth}}$ with weighted terms and $m$-ranking depths. The approach yields improved geometric fidelity and motion separation on synthetic and real dynamic scenes, outperforming prior Gaussian Splatting variants and approaching NeRF-like quality under challenging camera motion. By delivering continuous-time warp-field regularization and reducing bias from purely data-driven priors, the method offers scalable, accurate dynamic scene reconstruction with minimal camera movement and opens avenues for 3D tracking and robust in-the-wild rendering. The results demonstrate the value of integrating analytical scene flow into deformable Gaussians for dynamic 3D vision tasks.
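The time-integration step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the velocity field `toy_velocity` (a rigid rotation), the helper names, the step count, and the L2 residual are all assumptions chosen so the example is self-contained. It only shows how a classical fourth-order Runge-Kutta integrator turns a velocity field into a displacement that can be compared with a reference flow.

```python
import math

def rk4_step(velocity, x, t, dt):
    """Advance a 3D position x (tuple of floats) by one classical RK4 step."""
    def shifted(scale, k):
        # Returns x + scale * k, component-wise.
        return tuple(xi + scale * ki for xi, ki in zip(x, k))

    k1 = velocity(x, t)
    k2 = velocity(shifted(dt / 2, k1), t + dt / 2)
    k3 = velocity(shifted(dt / 2, k2), t + dt / 2)
    k4 = velocity(shifted(dt, k3), t + dt)
    return tuple(
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )

def integrate(velocity, x0, t0, t1, steps=8):
    """Integrate dx/dt = velocity(x, t) from t0 to t1 with fixed-size RK4 steps."""
    x, t = x0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        x = rk4_step(velocity, x, t, dt)
        t += dt
    return x

def toy_velocity(x, t):
    # Hypothetical velocity field (assumption): rotation about the z-axis
    # at 1 radian per unit time. A real model would query its warp field here.
    return (-x[1], x[0], 0.0)

def flow_residual(velocity, mean, t, dt, reference_flow):
    """L2 distance between the integrated offset and a reference flow vector."""
    moved = integrate(velocity, mean, t, t + dt)
    offset = tuple(m - p for m, p in zip(moved, mean))
    return math.sqrt(sum((o - r) ** 2 for o, r in zip(offset, reference_flow)))

mean_t = (1.0, 0.0, 0.0)  # one Gaussian mean at time t (toy value)
dt = 0.1
# For the toy rotation the exact displacement is known in closed form.
exact_flow = (math.cos(dt) - 1.0, math.sin(dt), 0.0)
residual = flow_residual(toy_velocity, mean_t, 0.0, dt, exact_flow)
```

In this toy setting the residual is close to zero because the reference flow is the exact displacement of the rotation. In a training loop, a residual of this kind would act as the penalty that pulls the integrated Gaussian offsets toward the reference flow.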

Abstract

Gaussian Splatting and its dynamic extensions are effective for reconstructing 3D scenes from 2D images when there is significant camera movement to facilitate motion parallax and when scene objects remain relatively static. However, in many real-world scenarios, these conditions are not met. As a consequence, data-driven semantic and geometric priors have been favored as regularizers, despite their bias toward training data and their neglect of broader movement dynamics. Departing from this practice, we propose a novel analytical approach that adapts the classical Lucas-Kanade method to dynamic Gaussian splatting. By leveraging the intrinsic properties of the forward warp field network, we derive an analytical velocity field that, through time integration, facilitates accurate scene flow computation. This enables the precise enforcement of motion constraints on warp fields, thus constraining both 2D motion and 3D positions of the Gaussians. Our method excels in reconstructing highly dynamic scenes with minimal camera movement, as demonstrated through experiments on both synthetic and real-world scenes.
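
The time integration described in the abstract can be written out as a short worked relation, using the notation of the Figure 2 caption ($\mathcal{G}_{t}$ for the Gaussians at time $t$, $v(\mathcal{G};t)$ for the warp-field velocity). The integral form and the symbol $\mathcal{F}_{\text{ref}}$ for the reference scene flow are assumptions made here for illustration and may differ from the paper's exact formulation:

$$\mathcal{G}_{t+\Delta t} = \mathcal{G}_{t} + \int_{t}^{t+\Delta t} v(\mathcal{G}; \tau)\,\mathrm{d}\tau, \qquad \mathcal{G}_{t+\Delta t} - \mathcal{G}_{t} \approx \mathcal{F}_{\text{ref}}$$

Under this reading, the motion constraint penalizes the gap between the integrated offset on the left and the reference flow on the right.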

Paper Structure

This paper contains 15 sections, 16 equations, 12 figures, 6 tables.

Figures (12)

  • Figure 1: Data-driven depth and optical flow supervision produces inaccurate geometries. Instead, we derive the analytical warp field of Gaussians to refine geometries and motions.
  • Figure 2: Analytical scene flow from warp field. With canonical Gaussians $\mathcal{G}_c$, we transform them forward in time to $\mathcal{G}_{t}$, then perform time integration from warp field velocities $v(\mathcal{G};t)$ to derive $\mathcal{G}_{t+\Delta t}$. The Gaussian offsets $\mathcal{G}_{t+\Delta t} - \mathcal{G}_{t}$ are compared to reference scene flow.
  • Figure 3: (a) Visualization of the Gaussians' travel distance $\|\boldsymbol{\mu}-\boldsymbol{\mu}_{o}\|_2$. In both scenes, the humans are stationary while the dinosaur balloons move around. Our result correctly identifies the dynamic regions, whereas the baseline model introduces motion in the background and on the supposedly stationary humans to compensate for photometric correctness. (b) 3D visualization of the motion trajectories. Our result shows clean trajectories from the waving balloon.
  • Figure 4: Qualitative comparisons on the Dynamic Scenes dataset. Compared to the baseline method, our approach can achieve superior rendering quality on real datasets with lower EMFs.
  • Figure 5: Our method struggles with insufficient point cloud initialization due to reliance on non-rigid warping of scene geometry. In the "skating" scene, this results in unstable geometry at certain angles. Inaccurate camera calibrations also degrade our method, as shown in the "toby-sit" sequence, where miscalibration distorts the scene geometry.
  • ...and 7 more figures