RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance
Chengrui Wang, Pengfei Liu, Min Zhou, Ming Zeng, Xubin Li, Tiezheng Ge, Bo Zheng
TL;DR
RHanDS addresses the unstable hand structures in diffusion-generated images by decoupling style guidance from structure guidance and learning them with a two-stage training strategy. The model pairs a VAE and a conditional U-Net with a CLIP-based style encoder and a structure encoder that takes depth from a reconstructed hand mesh, so that malformed hands can be repainted locally within the affected region. The authors introduce three multi-style datasets to support separate learning of style and structure. Their experiments report that RHanDS improves both structural accuracy (MPJPE) and style consistency (FID, Style Loss) across multiple styles and outperforms HandRefiner baselines. The result is a practical route to structurally correct, style-consistent hands in diffusion-generated images.
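For readers unfamiliar with the structural metric named above, MPJPE (mean per-joint position error) is conventionally the average Euclidean distance between predicted and reference joint positions. The sketch below shows that standard definition only. The function name, the toy coordinates, and the lack of any alignment step are illustrative assumptions, and the paper's own evaluation protocol (joint set, units, alignment) is not described in this summary.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean per-joint position error: average Euclidean distance
    between matched predicted and reference joints."""
    if not pred_joints or len(pred_joints) != len(gt_joints):
        raise ValueError("joint lists must be non-empty and the same length")
    total = 0.0
    for pred, gt in zip(pred_joints, gt_joints):
        total += math.dist(pred, gt)  # Euclidean distance for one joint
    return total / len(pred_joints)


# Toy example with three 3D joints (hypothetical coordinates, arbitrary units).
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # per-joint errors 0, 1, 5 -> mean 2.0
```

Lower values mean the refined hand's joints sit closer to the reference pose. The metric says nothing about appearance, which is why the summary reports it alongside FID and Style Loss.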
Abstract
Although diffusion models can generate high-quality human images, their applications are limited by the instability in generating hands with correct structures. In this paper, we introduce RHanDS, a conditional diffusion-based framework designed to refine malformed hands by utilizing decoupled structure and style guidance. The hand mesh reconstructed from the malformed hand offers structure guidance for correcting the structure of the hand, while the malformed hand itself provides style guidance for preserving the style of the hand. To alleviate the mutual interference between style and structure guidance, we introduce a two-stage training strategy and build a series of multi-style hand datasets. In the first stage, we use paired hand images for training to ensure stylistic consistency in hand refining. In the second stage, various hand images generated based on human meshes are used for training, enabling the model to gain control over the hand structure. Experimental results demonstrate that RHanDS can effectively refine hand structure while preserving consistency in hand style.
