ArtGS: Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting

Yu Liu, Baoxiong Jia, Ruijie Lu, Junfeng Ni, Song-Chun Zhu, Siyuan Huang

TL;DR

ArtGS reconstructs and animates complex, multi-part articulated objects from multi-view observations of two object states. It introduces a Gaussian-splatting-based representation with a canonical, motion-informed initialization and a skinning-inspired, center-based part dynamics model. The method jointly optimizes canonical Gaussians, per-part motion parameterized by dual quaternions, and part assignments, guided by self-supervision and warm-up strategies for robust convergence. Results on synthetic and real datasets show state-of-the-art performance in both joint parameter estimation and part-mesh reconstruction, and the method scales well to objects with multiple movable parts. The approach improves reconstruction quality and efficiency and lays groundwork for reliable digital twins in robotics and AR. The authors also outline limitations and avenues for future work, such as multi-state extensions and improved center initialization.

Abstract

Building articulated objects is a key challenge in computer vision. Existing methods often fail to effectively integrate information across different object states, limiting the accuracy of part-mesh reconstruction and part dynamics modeling, particularly for complex multi-part articulated objects. We introduce ArtGS, a novel approach that leverages 3D Gaussians as a flexible and efficient representation to address these issues. Our method incorporates canonical Gaussians with coarse-to-fine initialization and updates for aligning articulated part information across different object states, and employs a skinning-inspired part dynamics modeling module to improve both part-mesh reconstruction and articulation learning. Extensive experiments on both synthetic and real-world datasets, including a new benchmark for complex multi-part objects, demonstrate that ArtGS achieves state-of-the-art performance in joint parameter estimation and part mesh reconstruction. Our approach significantly improves reconstruction quality and efficiency, especially for multi-part articulated objects. Additionally, we provide comprehensive analyses of our design choices, validating the effectiveness of each component to highlight potential areas for future improvement. Our work is made publicly available at: https://articulate-gs.github.io.
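
The abstract and TL;DR name three ingredients: canonical Gaussians, center-based part assignment, and per-part motion expressed as dual quaternions. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. The function names, the softmax-over-distance assignment, the temperature `tau`, and the (w, x, y, z) quaternion layout are all assumptions made for illustration.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z); works on (N, 4) arrays."""
    aw, ax, ay, az = np.moveaxis(a, -1, 0)
    bw, bx, by, bz = np.moveaxis(b, -1, 0)
    return np.stack([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ], axis=-1)

def qconj(q):
    """Quaternion conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def to_dual_quat(rot, trans):
    """Build (real, dual) parts from unit rotations (K, 4) and translations (K, 3)."""
    t = np.concatenate([np.zeros((len(trans), 1)), trans], axis=-1)
    return rot, 0.5 * qmul(t, rot)

def soft_assign(points, centers, tau=0.1):
    """Hypothetical center-based assignment: softmax over negative squared distance."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def blend_and_apply(points, weights, real, dual):
    """Blend per-part dual quaternions per point, then move each point.

    Assumes all rotation quaternions lie in the same hemisphere (q and -q
    encode the same rotation, so mixed signs would corrupt the blend).
    """
    br = weights @ real                      # (N, 4) blended real part
    bd = weights @ dual                      # (N, 4) blended dual part
    norm = np.linalg.norm(br, axis=-1, keepdims=True)
    br, bd = br / norm, bd / norm
    p = np.concatenate([np.zeros((len(points), 1)), points], axis=-1)
    rotated = qmul(qmul(br, p), qconj(br))[:, 1:]
    translation = 2.0 * qmul(bd, qconj(br))[:, 1:]
    return rotated + translation
```

In this sketch, canonical Gaussian centers would be passed as `points`, and each part contributes one rigid motion. The rotation here acts about the origin; a revolute joint with an off-origin axis would fold the pivot into the translation term.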

Paper Structure

This paper contains 52 sections, 13 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: The overview of ArtGS. Our method is divided into two stages: (i) obtaining coarse canonical Gaussians $\mathcal{G}^c_{\text{init}}$ by matching the Gaussians $\mathcal{G}^0_{\text{single}}$ and $\mathcal{G}^1_{\text{single}}$, each trained on a single state individually, and initializing the part assignment module with clustered centers, (ii) jointly optimizing the canonical Gaussians $\mathcal{G}^c$ and the articulation model (including the articulation parameters $\Psi$ and the part assignment module).
  • Figure 2: Qualitative visualizations of PARIS objects. We present reconstruction comparisons between DTA and our model on Real Storage (Top) and Synthetic Blade (Bottom). DTA struggles with mesh reconstruction at the low-visibility state, as it processes each state separately. In contrast, our method leverages the connection between states to improve the reconstruction for both low- and high-visibility states.
  • Figure 3: Qualitative results on multi-part objects. We present reconstruction comparisons between DTA and our model on Storage-47648 (Left) and Table-31249 (Bottom). On ArtGS-Multi, DTA struggles with movable part identification and axis prediction as the number of parts increases, whereas our model maintains high performance regardless of part count, achieving high-quality reconstruction of part mesh and joint articulation.
  • Figure 4: Ablation Studies. We visualize the initialized and optimized canonical Gaussians with their part assignment and centers for the full model, w/o Motion Prior and w/o Cano. Init. We highlight center error, part assignment error, and canonical Gaussian error with red, green, and blue bounding boxes, respectively.
  • Figure A.1: Failure cases. We illustrate failure cases of our ArtGS. 'Init./Opt. Cano.' represents initialized and optimized Canonical Gaussians, while the prefix 'M' indicates manual correction of erroneous part centers.
  • ...and 3 more figures