SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Patrick Rim; Kevin Harris; Braden Copple; Shangchen Han; Xu Xie; Ivan Shugurov; Sizhe An; He Wen; Alex Wong; Tomas Hodan; Kun He

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Patrick Rim, Kevin Harris, Braden Copple, Shangchen Han, Xu Xie, Ivan Shugurov, Sizhe An, He Wen, Alex Wong, Tomas Hodan, Kun He

Abstract

Accurate 3D understanding of human hands and objects during manipulation remains a significant challenge for egocentric computer vision. Existing hand-object interaction datasets are predominantly captured in controlled studio settings, which limits both environmental diversity and the ability of models trained on such data to generalize to real-world scenarios. To address this challenge, we introduce a novel marker-less multi-camera system that allows for nearly unconstrained mobility in genuinely in-the-wild conditions, while still having the ability to generate precise 3D annotations of hands and objects. The capture system consists of a lightweight, back-mounted, multi-camera rig that is synchronized and calibrated with a user-worn VR headset. For 3D ground-truth annotation of hands and objects, we develop an ego-exo tracking pipeline and rigorously evaluate its quality. Finally, we present SHOW3D, the first large-scale dataset with 3D annotations that show hands interacting with objects in diverse real-world environments, including outdoor settings. Our approach significantly reduces the fundamental trade-off between environmental realism and accuracy of 3D annotations, which we validate with experiments on several downstream tasks. show3d-dataset.github.io

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Abstract

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Abstract

Paper Structure

Table of Contents

Figures (16)