FreeGaussian: Annotation-free Control of Articulated Objects via 3D Gaussian Splats with Flow Derivatives

Qizhi Chen, Delin Qu, Junli Liu, Yiwen Tang, Haoming Song, Dong Wang, Bin Zhao, Xuelong Li

TL;DR

FreeGaussian presents an annotation-free pipeline for controllable view synthesis (CVS) of articulated objects by deriving dynamic Gaussian flow from optical flow and camera motion, eliminating the need for manual masks or control signals. A key innovation is the 3D spherical vector control, which encodes per-Gaussian trajectories as state representations, enabling interactive manipulation without explicit trajectory fitting. The method combines flow-guided optimization with a robust clustering step (HDBSCAN) to localize interactive Gaussians, achieving state-of-the-art or competitive results on multiple datasets while maintaining real-time rendering potential. Overall, the work advances practical, annotation-free dynamic scene reconstruction with precise part-aware controllability and efficient training. Its flow-derivative framework and 3D spherical control offer a scalable path toward annotation-free CVS in real-world environments.

Abstract

Reconstructing controllable Gaussian splats for articulated objects from monocular video is especially challenging because the problem is inherently under-constrained. Existing methods address this by relying on dense masks and manually defined control signals, which limits their real-world applicability. In this paper, we propose an annotation-free method, FreeGaussian, which mathematically disentangles camera egomotion and articulated movements via flow derivatives. By establishing a connection between 2D flows and 3D Gaussian dynamic flow, our method enables optimization and continuity of dynamic Gaussian motions from flow priors without any control signals. Furthermore, we introduce a 3D spherical vector controlling scheme, which represents the state as a 3D Gaussian trajectory, thereby eliminating the need for complex 1D control signal calculations and simplifying controllable Gaussian modeling. Extensive experiments on articulated objects demonstrate state-of-the-art visual performance and precise, part-aware controllability. Code is available at: https://github.com/Tavish9/freegaussian.
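
The abstract does not spell out how the 3D spherical vector is parametrized. As an illustration only, the sketch below assumes that a Gaussian's trajectory is summarized by its start-to-end displacement, expressed as a magnitude plus two angles, and that a control state scales the magnitude along that direction. The names `to_spherical`, `from_spherical`, and `position_at`, and all numeric values, are hypothetical and are not taken from the paper or its code.

```python
import math

def to_spherical(dx, dy, dz):
    """Convert a 3D displacement to (r, theta, phi).

    Assumed convention: theta is the polar angle from +z, phi the azimuth in the x-y plane.
    """
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        return 0.0, 0.0, 0.0  # no motion: angles are arbitrary, use zeros
    cos_theta = max(-1.0, min(1.0, dz / r))  # clamp against rounding error
    return r, math.acos(cos_theta), math.atan2(dy, dx)

def from_spherical(r, theta, phi):
    """Inverse of to_spherical: recover the Cartesian displacement."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )

# Hypothetical trajectory endpoints of one Gaussian centre.
start = (0.0, 0.0, 0.0)
end = (0.3, 0.4, 1.2)
r, theta, phi = to_spherical(*(e - s for s, e in zip(start, end)))

def position_at(alpha):
    """Assumed control: alpha in [0, 1] scales the magnitude along a fixed direction."""
    offset = from_spherical(alpha * r, theta, phi)
    return tuple(s + o for s, o in zip(start, offset))
```

Under these assumptions the control input is a single 3D vector per interactive part, and intermediate states follow from scaling its magnitude rather than from a separately computed 1D signal.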

Paper Structure

This paper contains 23 sections, 15 equations, 16 figures, 7 tables, 1 algorithm.

Figures (16)

  • Figure 1: The overview of FreeGaussian. Given a video stream $\{\mathbf{P}(t), \mathbf{I}(t)\}$, our method recovers controllable 3D Gaussians $\mathbf{G}^{\ast}$ in two stages. First, we pre-train a deformable 3DGS and calculate the dynamic Gaussian flow $\mathbf{u}^\text{GS}$ via Eq. \ref{eq:gaussian_flow_analysis}. Then, we reproject the dynamic Gaussian flow maps and cluster the active Gaussians with the HDBSCAN algorithm, followed by trajectory calculation. In the controllable training stage, we optimize the Gaussians $\mathbf{G}$ and the network $\mathbf{\Theta}$ under the rasterisation loss in Eq. \ref{eq:loss}, which jointly aligns rendered images with input views and enforces consistency in the predicted dynamic flows.
  • Figure 2: Dynamic Gaussian flow illustration. In interactive scenes, consider an instantaneous motion model in which the camera and the 3D Gaussians have separate velocities between consecutive frames. The projected optical flow $\mathbf{u}$ can be decomposed into camera flow $\mathbf{u}^{\text{Cam}}$ and dynamic Gaussian flow $\mathbf{u}^{\text{GS}}$, as described in Eqs. \ref{eq:gaussian_flow_analysis} and \ref{eq:dynamic_gs_flow}.
  • Figure 3: Illustration of the dynamic Gaussian flow map under static and dynamic scenes. a) In static scenes with only camera motion, Eq. \ref{eq:dynamic_gs_flow} degenerates to pure camera flow, yielding zero dynamic Gaussian flow. b) In contrast, when an articulated object moves, the dynamic Gaussian flow map highlights the interactive 3D Gaussians.
  • Figure 4: View Synthesis Visualization on CoNeRF Dataset. In comparison with other methods, FreeGaussian achieves more realistic and detailed rendering quality, whereas other methods suffer from ghosting artifacts.
  • Figure 5: Flow Decoupling Comparison. FreeGaussian (row 2) cleanly separates camera egomotion from the microwave’s self-motion, producing artifact-free dynamic Gaussian flow.
  • ...and 11 more figures
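
The decomposition in the Figure 2 and Figure 3 captions, $\mathbf{u} = \mathbf{u}^{\text{Cam}} + \mathbf{u}^{\text{GS}}$, can be illustrated with a minimal sketch. It uses the textbook instantaneous motion-field equations for a pinhole camera with normalized image coordinates and known depth; this is an assumption standing in for the paper's own equations, which are not reproduced on this page. The function names `camera_flow` and `dynamic_flow` and all numeric values are hypothetical.

```python
import math

def camera_flow(x, y, depth, lin_vel, ang_vel):
    """Flow of a static point induced by camera motion alone.

    Assumptions: (x, y) are normalized image coordinates, depth is the point's
    z in the camera frame, lin_vel and ang_vel are the camera's linear and
    angular velocities expressed in that frame (textbook motion-field model).
    """
    tx, ty, tz = lin_vel
    wx, wy, wz = ang_vel
    u = (-tx + x * tz) / depth + wx * x * y - wy * (1.0 + x * x) + wz * y
    v = (-ty + y * tz) / depth + wx * (1.0 + y * y) - wy * x * y - wz * x
    return u, v

def dynamic_flow(observed, x, y, depth, lin_vel, ang_vel):
    """Residual flow after removing the camera-induced part: u - u_cam."""
    cam_u, cam_v = camera_flow(x, y, depth, lin_vel, ang_vel)
    return observed[0] - cam_u, observed[1] - cam_v

# Hypothetical camera motion and pixel, for illustration only.
lin_vel = (0.1, 0.0, 0.0)
ang_vel = (0.0, 0.02, 0.0)
x, y, depth = 0.2, -0.1, 2.0

# Static point: the observed flow is exactly the camera flow, so the residual vanishes.
static_obs = camera_flow(x, y, depth, lin_vel, ang_vel)
static_residual = dynamic_flow(static_obs, x, y, depth, lin_vel, ang_vel)

# Moving part: an extra image-plane motion is added on top of the camera flow.
moving_obs = (static_obs[0] + 0.05, static_obs[1] - 0.03)
moving_residual = dynamic_flow(moving_obs, x, y, depth, lin_vel, ang_vel)

print(math.hypot(*static_residual), math.hypot(*moving_residual))
```

In this toy setting the residual is zero for the static point and nonzero for the moving one, which mirrors the behaviour the Figure 3 caption describes: a map of such residuals highlights only the regions that move independently of the camera.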