RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images

Benzhi Wang, Jingkai Zhou, Jingqi Bai, Yang Yang, Weihua Chen, Fan Wang, Zhen Lei

TL;DR

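RealisHuman is a two-stage post-processing framework for generated images: it first regenerates malformed human parts such as hands and faces, using the original parts as references, and then blends the rectified parts back into the image by repainting the surrounding area. The authors report improved realism in both qualitative and quantitative evaluations.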

Abstract

In recent years, diffusion models have revolutionized visual generation, outperforming traditional frameworks like Generative Adversarial Networks (GANs). However, generating images of humans with realistic semantic parts, such as hands and faces, remains a significant challenge due to their intricate structural complexity. To address this issue, we propose a novel post-processing solution named RealisHuman. The RealisHuman framework operates in two stages. First, it generates realistic human parts, such as hands or faces, using the original malformed parts as references, ensuring consistent details with the original image. Second, it seamlessly integrates the rectified human parts back into their corresponding positions by repainting the surrounding areas to ensure smooth and realistic blending. The RealisHuman framework significantly enhances the realism of human generation, as demonstrated by notable improvements in both qualitative and quantitative metrics. Code is available at https://github.com/Wangbenzhi/RealisHuman.

Paper Structure (12 sections, 7 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Illustration of our repair results. Each pair consists of two images: the left image is the original, and the right image is the repair result.
  • Figure 2: Details of our RealisHuman. Our method separates the task of refining malformed human parts into two distinct stages. In the first stage, we focus on generating realistic human parts using the Part Detail Encoder. Given an image containing malformed human parts, we begin by locating and cropping the target regions. Subsequently, we filter the background of the target regions, creating a reference image that provides essential part details, such as skin tone. We also estimate the 3D structure of the human parts to serve as pose guidance. Leveraging both the reference images and the part structures, we generate realistic human parts $r_{part}$ with accurate structures and detailed information. In the second stage, our goal is to seamlessly integrate the refined human parts into the corresponding regions of the original image, resulting in the refined image $I^{'}$. To avoid a cut-and-paste appearance, we repaint the area between the background and the rectified human parts, ensuring a seamless integration and a more natural overall appearance.
  • Figure 3: Comparison of hand refinement results. Each set of images displays, from left to right, the original image, our method's repair result, and the HandRefiner method's repair result.
  • Figure 4: Comparison of face refinement results. The first row shows the original images, and the second row shows the images after face refinement.
  • Figure 5: Comparison of directly pasting the rectified human parts versus our method. The first row shows the results of direct pasting, which exhibit visible artifacts. The second row demonstrates the effectiveness of our method in achieving seamless integration.
  • ...and 2 more figures
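
The Figure 2 caption describes cropping the target region and, in the second stage, repainting "the area between the background and the rectified human parts". The following is a minimal sketch of those two geometric steps only, not the authors' implementation. It assumes a binary part mask is already available, and it assumes the repaint region can be approximated as a band obtained by dilating that mask and subtracting the original. The names `bounding_box`, `dilate`, `repaint_band` and the parameters `pad` and `radius` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = pixel belongs to the human part, 0 = background


def bounding_box(mask: Mask, pad: int = 0) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right), inclusive, padded and clamped to the image."""
    h, w = len(mask), len(mask[0])
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(w) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask is empty")
    return (max(rows[0] - pad, 0), max(cols[0] - pad, 0),
            min(rows[-1] + pad, h - 1), min(cols[-1] + pad, w - 1))


def dilate(mask: Mask, radius: int) -> Mask:
    """Binary dilation with a square (2 * radius + 1) structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                # Mark every pixel in the square window around (r, c).
                for rr in range(max(r - radius, 0), min(r + radius, h - 1) + 1):
                    for cc in range(max(c - radius, 0), min(c + radius, w - 1) + 1):
                        out[rr][cc] = 1
    return out


def repaint_band(part_mask: Mask, radius: int) -> Mask:
    """Pixels within `radius` of the part but not on it (assumed repaint region)."""
    grown = dilate(part_mask, radius)
    return [[g & (1 - p) for g, p in zip(grown_row, part_row)]
            for grown_row, part_row in zip(grown, part_mask)]
```

For a 5x5 mask with a single part pixel at the center, `bounding_box(mask, pad=1)` gives the crop `(1, 1, 3, 3)`, and `repaint_band(mask, 1)` marks the 8 neighbouring pixels while leaving the part pixel itself untouched. In a real pipeline the band would be handed to an inpainting model as the region to regenerate; that step is outside the scope of this sketch.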