Human Body Restoration with One-Step Diffusion Model and A New Benchmark
Jue Gong, Jingkai Wang, Zheng Chen, Xing Liu, Hong Gu, Yulun Zhang, Xiaokang Yang
TL;DR
This work tackles the absence of benchmarks for human body restoration by introducing HQ-ACF, a pipeline used to curate the PERSONA dataset of 109,052 high-quality 512×512 human images across diverse natural activities. It further introduces OSDHuman, a one-step diffusion model guided by a high-fidelity image embedder (HFIE) and optimized with variational score distillation (VSD) to align outputs with the natural image distribution. Empirical results show that OSDHuman achieves superior visual quality and quantitative metrics on both synthetic and real-world PERSONA data, outperforming several baseline diffusion methods while reducing inference cost. Together, the dataset and the specialized one-step model offer a practical, scalable approach to high-quality human body restoration.
Abstract
Human body restoration, as a specific application of image restoration, is widely applied in practice and plays a vital role across diverse fields. However, thorough research remains difficult, particularly due to the lack of benchmark datasets. In this study, we propose a high-quality dataset automated cropping and filtering (HQ-ACF) pipeline. This pipeline leverages existing object detection datasets and other unlabeled images to automatically crop and filter high-quality human images. Using this pipeline, we construct a person-based restoration with sophisticated objects and natural activities (\emph{PERSONA}) dataset, which includes training, validation, and test sets. The dataset significantly surpasses other human-related datasets in both quality and content richness. Finally, we propose \emph{OSDHuman}, a novel one-step diffusion model for human body restoration. Specifically, we propose a high-fidelity image embedder (HFIE) as the prompt generator to better guide the model with low-quality human image information, effectively avoiding misleading prompts. Experimental results show that OSDHuman outperforms existing methods in both visual quality and quantitative metrics. The dataset and code will be available at https://github.com/gobunu/OSDHuman.
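To make the crop-and-filter idea concrete, the following is a minimal sketch of how person bounding boxes from a detection dataset might be turned into square crops and then filtered. It is not the HQ-ACF implementation: the COCO-style `(x, y, width, height)` box format, the function names `square_crop` and `select_crops`, the 512-pixel minimum side, and the optional `score_fn` quality scorer are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# (x, y, width, height) in pixels; COCO-style layout is an assumption.
Box = Tuple[int, int, int, int]


def square_crop(box: Box, img_w: int, img_h: int) -> Optional[Box]:
    """Return a square window that contains `box` and lies inside the image.

    Returns None when the required square is larger than the image.
    """
    x, y, w, h = box
    side = max(w, h)
    if side > min(img_w, img_h):
        return None
    # Centre the square on the box using integer arithmetic.
    left = x - (side - w) // 2
    top = y - (side - h) // 2
    # Shift the window back inside the image bounds if it sticks out.
    left = max(0, min(left, img_w - side))
    top = max(0, min(top, img_h - side))
    return (left, top, side, side)


def select_crops(
    boxes: Iterable[Box],
    img_w: int,
    img_h: int,
    min_side: int = 512,  # hypothetical threshold, chosen to match 512x512 outputs
    score_fn: Optional[Callable[[Box], float]] = None,  # hypothetical quality scorer
    min_score: float = 0.0,
) -> List[Box]:
    """Keep square crops that are large enough and pass the optional quality score."""
    kept: List[Box] = []
    for box in boxes:
        crop = square_crop(box, img_w, img_h)
        if crop is None or crop[2] < min_side:
            continue  # too small to yield a sharp 512x512 image without upsampling
        if score_fn is not None and score_fn(crop) < min_score:
            continue  # rejected by the quality filter
        kept.append(crop)
    return kept


if __name__ == "__main__":
    person_boxes = [(100, 50, 300, 600), (10, 10, 40, 80), (900, 0, 100, 700)]
    print(select_crops(person_boxes, img_w=1000, img_h=800))
```

In this sketch the size check stands in for resolution filtering, and `score_fn` marks the place where an image quality assessment model could be plugged in. The filters actually used by the pipeline are not specified in this section.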
