PersonificationNet: Making customized subject act like a person
Tianchu Guo, Pengyu Li, Biao Wang, Xiansheng Hua
TL;DR
This work tackles the challenge of rendering a customized subject in the pose and background of a reference image. It introduces PersonificationNet, a framework with three components: a Customized Branch that captures the subject's appearance, a finetuned Pose Condition Branch that transfers pose, and a Structure Alignment Module that reconciles the subject's body proportions with the reference pose during inference. The Customized Branch is trained on 3–5 user-provided images with a rare-token identifier, the Pose Condition Branch is finetuned on 55 subject-specific images, and the structure alignment step lets the subject keep its own proportions while adopting the reference pose. Together they outperform DreamBooth and DreamBooth+ControlNet on two target subjects. The approach yields faithful pose and background transfer for customized subjects, making diffusion-based image synthesis more controllable and personalized in practical applications.
Abstract
Customized generation, which uses as few as 3-5 user-provided images to train a model that synthesizes new images of a specified subject, has recently shown significant potential. Although subsequent applications have improved the flexibility and diversity of customized generation, fine-grained control that makes the given subject adopt a person's pose remains understudied. In this paper, we propose PersonificationNet, which can make a specified subject, such as a cartoon character or plush toy, take the same pose as the person in a given reference image. It consists of a customized branch, a pose condition branch, and a structure alignment module. First, the customized branch mimics the appearance of the specified subject. Second, the pose condition branch transfers body structure information from humans to varied instances. Finally, the structure alignment module bridges the structural gap between the human and the specified subject at inference time. Experimental results show that the proposed PersonificationNet outperforms state-of-the-art methods.
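The abstract does not say how the structure alignment module is implemented. As a rough illustration of what bridging the structure gap between a human and a differently proportioned subject could look like, the sketch below applies generic bone-length retargeting to 2D keypoints: it keeps the bone directions of the reference pose and substitutes the subject's bone lengths. The skeleton layout, the joint names, and the function names `bone_lengths` and `align_structure` are all hypothetical, and this is not presented as the authors' method.

```python
import math

# Hypothetical toy skeleton (an assumption for illustration): child -> parent.
# Entries are ordered so that every parent appears before its children.
PARENTS = {
    "neck": "hip",
    "head": "neck",
    "l_hand": "neck",
    "r_hand": "neck",
    "l_foot": "hip",
    "r_foot": "hip",
}
ROOT = "hip"


def bone_lengths(pose):
    """Return the length of each child -> parent bone in a 2D pose."""
    return {c: math.dist(pose[c], pose[p]) for c, p in PARENTS.items()}


def align_structure(ref_pose, subject_lengths):
    """Keep the reference bone directions but use the subject's bone lengths."""
    aligned = {ROOT: ref_pose[ROOT]}  # the root joint stays where it is
    for child, parent in PARENTS.items():
        dx = ref_pose[child][0] - ref_pose[parent][0]
        dy = ref_pose[child][1] - ref_pose[parent][1]
        norm = math.hypot(dx, dy)
        # A zero-length reference bone has no direction; collapse it onto the parent.
        ux, uy = (dx / norm, dy / norm) if norm > 0 else (0.0, 0.0)
        px, py = aligned[parent]
        length = subject_lengths[child]
        aligned[child] = (px + ux * length, py + uy * length)
    return aligned


# Reference human pose (made-up coordinates, y pointing down as in images).
human_pose = {
    "hip": (0.0, 0.0), "neck": (0.0, -4.0), "head": (0.0, -5.0),
    "l_hand": (-3.0, -4.0), "r_hand": (3.0, -4.0),
    "l_foot": (-1.0, 4.0), "r_foot": (1.0, 4.0),
}
# Made-up proportions for a plush toy: short torso and limbs, large head.
toy_lengths = {"neck": 2.0, "head": 2.0, "l_hand": 1.0,
               "r_hand": 1.0, "l_foot": 1.0, "r_foot": 1.0}

toy_pose = align_structure(human_pose, toy_lengths)
```

In a pipeline like the one described, keypoints adjusted this way would be rendered into the conditioning input for the pose branch, so that the subject is not forced into human proportions. That wiring is likewise an assumption here.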
