MANUS: Markerless Grasp Capture using Articulated 3D Gaussians

Chandradeep Pokhariya, Ishaan N Shah, Angela Xing, Zekun Li, Kefan Chen, Avinash Sharma, Srinath Sridhar

TL;DR

MANUS introduces a novel articulated 3D Gaussians representation that extends 3D Gaussian splatting for high-fidelity representation of articulating hands. The method outperforms others on a quantitative contact evaluation that uses paint transfer from the object to the hand.

Abstract

Understanding how we grasp objects with our hands has important applications in areas like robotics and mixed reality. However, this challenging problem requires accurate modeling of the contact between hands and objects. To capture grasps, existing methods use skeletons, meshes, or parametric models that do not represent hand shape accurately, resulting in inaccurate contacts. We present MANUS, a method for Markerless Hand-Object Grasp Capture using Articulated 3D Gaussians. We build a novel articulated 3D Gaussians representation that extends 3D Gaussian splatting for high-fidelity representation of articulating hands. Since our representation uses Gaussian primitives, it enables us to efficiently and accurately estimate contacts between the hand and the object. For the most accurate results, our method requires tens of camera views that current datasets do not provide. We therefore build MANUS-Grasps, a new dataset that contains hand-object grasps viewed from 50+ cameras across 30+ scenes and 3 subjects, comprising over 7M frames. In addition to extensive qualitative results, we also show that our method outperforms others on a quantitative contact evaluation method that uses paint transfer from the object to the hand.
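The abstract states that Gaussian primitives make contact estimation efficient but does not give the formulation here. As a rough illustration only, the sketch below assumes that contact can be approximated by the distance from each hand Gaussian center to its nearest object Gaussian center. The function name `contact_map`, the linear falloff, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
import math


def contact_map(hand_centers, object_centers, threshold):
    """Return one contact weight in [0, 1] per hand Gaussian center.

    Assumption (not from the paper): a hand Gaussian counts as touching the
    object when its center lies within `threshold` of the nearest object
    Gaussian center, and the weight falls off linearly with that distance.
    `object_centers` must be non-empty.
    """
    weights = []
    for h in hand_centers:
        # Brute-force nearest neighbour; a spatial index would scale better.
        nearest = min(math.dist(h, o) for o in object_centers)
        weights.append(max(0.0, 1.0 - nearest / threshold))
    return weights


# Toy data in metres: one hand point near the object, one far away.
hand = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.05)]
obj = [(0.0, 0.0, 0.001), (0.1, 0.0, 0.0)]
weights = contact_map(hand, obj, threshold=0.004)
```

In this toy case the first hand point receives a nonzero weight and the second receives zero. The real method may also use the extent of each Gaussian rather than only its center.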

Paper Structure (17 sections, 7 equations, 13 figures, 6 tables)


Figures (13)

  • Figure 1: We introduce MANUS, a novel markerless approach for capturing grasps that employs an articulated 3D Gaussian representation to accurately model hand shapes. This approach improves contact estimation accuracy compared with template-based approaches when evaluated against ground truth contacts.
  • Figure 2: MANUS-Hand is a template-free, articulable hand model learned from multi-view hand sequences. It uses a 3D Gaussian splatting representation to accurately model the shape and appearance of hands.
  • Figure 3: MANUS uses a driving pose to place MANUS-Hand in the grasp scene. The posed hand is combined with an object model to obtain instantaneous and accumulated contacts between the two.
  • Figure 4: Qualitative comparison of MANUS-Hand with LiveHand [mundra2023livehand] and TAVA [li2022tava]. Notably, our renderings closely resemble those of LiveHand and surpass TAVA in quality, even in the absence of any components designed to enhance photorealism.
  • Figure 5: Here we show our contact estimation results on novel views for a variety of objects. We show both instantaneous and accumulated contacts for the hand in a canonical pose. Best viewed zoomed.
  • ...and 8 more figures
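
The captions for Figures 3 and 5 distinguish instantaneous from accumulated contacts without defining the accumulation. One plausible reading, sketched below as an assumption, is a per-Gaussian maximum over the frames of a grasp sequence, which is possible because every frame indexes the same Gaussians of the canonical hand. The function name `accumulate_contacts` is hypothetical.

```python
def accumulate_contacts(per_frame_weights):
    """Combine per-frame contact weights into one accumulated map.

    Assumption (not from the paper): the accumulated contact of a hand
    Gaussian is the largest instantaneous weight it reached in any frame.
    Every frame must list the same Gaussians in the same order.
    """
    accumulated = [0.0] * len(per_frame_weights[0])
    for frame in per_frame_weights:
        accumulated = [max(a, w) for a, w in zip(accumulated, frame)]
    return accumulated


# Two frames of instantaneous contact weights for three hand Gaussians.
frames = [
    [0.0, 0.75, 0.0],
    [0.5, 0.25, 0.0],
]
accumulated = accumulate_contacts(frames)
```

Here the third Gaussian never touches the object, so its accumulated weight stays at zero, while the first two keep the highest weight seen across the sequence.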