HandDiffuse: Generative Controllers for Two-Hand Interactions via Diffusion Models
Pei Lin, Sihang Xu, Hongdi Yang, Yiran Liu, Xin Chen, Jingya Wang, Jingyi Yu, Lan Xu
TL;DR
This work tackles the scarcity of temporally rich, strongly interacting two-hand motion data by introducing HandDiffuse12.5M and a diffusion-based baseline, HandDiffuse. The method employs two denoisers (Single Hand Denoiser and Interacting Hands Denoiser) and two motion representations (local and global), augmented by an Interaction Loss to model dynamic hand contact. Empirical results show HandDiffuse surpasses state-of-the-art methods in quality and diversity, and the dataset enables applications such as in-betweening, trajectory-conditioned generation, and data augmentation for other datasets. The dataset, code, and models will be released to spur further research in two-hand interaction modeling for AR/VR, robotics, and avatars.
Abstract
Existing hand datasets are largely short-range and contain only weak interactions because of the self-occlusion and self-similarity of hands, so they cannot yet meet the needs of interacting-hand motion generation. To address this data scarcity, we propose HandDiffuse12.5M, a novel dataset that consists of temporal sequences with strong two-hand interactions. HandDiffuse12.5M is the largest in scale and the richest in interactions among existing two-hand datasets. We further present a strong baseline method, HandDiffuse, for the controllable motion generation of interacting hands using various controllers. Specifically, we use a diffusion model as the backbone and design two motion representations for different controllers. To reduce artifacts, we also propose an Interaction Loss, which explicitly quantifies the dynamic interaction process. HandDiffuse enables various applications with vivid two-hand interactions, such as motion in-betweening and trajectory control. Experiments show that our method outperforms state-of-the-art techniques in motion generation and can also contribute to data augmentation for other datasets. Our dataset, corresponding code, and pre-trained models will be released to the community for future research on two-hand interaction modeling.
