HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

Maxim Nikolaev, Mikhail Kuznetsov, Dmitry Vetrov, Aibek Alanov

TL;DR

HairFastGAN addresses hair transfer under large pose differences by deploying a fast encoder-based pipeline that operates in StyleGAN's FS latent space and splits the problem into pose alignment, shape alignment, color alignment, and refinement alignment. A Rotate Encoder handles pose via pose-consistent rotation, FS/W+ mixing enables color editing in latent space, SEAN-based inpainting supports hair-shape transfer, and CLIP-guided color editing preserves identity with high fidelity. The Refinement alignment stage further recovers lost details at high resolution, yielding near real-time performance (e.g., <1s on Nvidia V100) with competitive realism metrics compared to optimization-based baselines. Ablation studies confirm the necessity of each module, and failure analyses outline limitations in long textures and extreme lighting, with plans to extend color-editing and shape-control capabilities.
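The stage ordering described above can be illustrated with a minimal structural sketch. Every name in it (`TransferState`, the four stage functions, `run_pipeline`) is hypothetical and is not the API of the released HairFastGAN code. The stages are stubs that only record what each step is responsible for, assuming the order pose alignment, shape alignment, color alignment, refinement alignment.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TransferState:
    """Hypothetical container: which image supplies which attribute."""
    face: str   # source photo whose identity is kept
    shape: str  # image providing the hair shape
    color: str  # image providing the hair color
    log: List[str] = field(default_factory=list)


Stage = Callable[[TransferState], TransferState]


def pose_alignment(s: TransferState) -> TransferState:
    # Stub: a real module would produce a pose-aligned mask with the desired hair shape.
    s.log.append(f"pose: mask with hair shape of {s.shape} in the pose of {s.face}")
    return s


def shape_alignment(s: TransferState) -> TransferState:
    # Stub: a real module would transfer the hairstyle shape using that mask.
    s.log.append(f"shape: transfer hair shape from {s.shape}")
    return s


def color_alignment(s: TransferState) -> TransferState:
    # Stub: a real module would transfer the hair color while preserving identity.
    s.log.append(f"color: transfer hair color from {s.color}")
    return s


def refinement_alignment(s: TransferState) -> TransferState:
    # Stub: a real module would restore source details lost during inversion.
    s.log.append(f"refine: restore details of {s.face}")
    return s


# Assumed stage order, following the description in the text.
PIPELINE: List[Stage] = [
    pose_alignment,
    shape_alignment,
    color_alignment,
    refinement_alignment,
]


def run_pipeline(state: TransferState, stages: List[Stage] = PIPELINE) -> TransferState:
    """Apply each stage in order, passing the state along."""
    for stage in stages:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_pipeline(TransferState("face.png", "shape.png", "color.png"))
    for entry in result.log:
        print(entry)
```

Keeping the stages as independent callables mirrors the modular split in the description: one stage could be dropped or swapped, as in an ablation, without touching the others.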

Abstract

Our paper addresses the complex task of transferring a hairstyle from a reference image to an input photo for virtual hair try-on. This task is challenging due to the need to adapt to various photo poses, the sensitivity of hairstyles, and the lack of objective metrics. The current state-of-the-art hairstyle transfer methods use an optimization process for different parts of the approach, making them inexcusably slow. At the same time, faster encoder-based models are of very low quality because they either operate in StyleGAN's W+ space or use other low-dimensional image generators. Additionally, both approaches have a problem with hairstyle transfer when the source pose is very different from the target pose, because they either don't consider the pose at all or deal with it inefficiently. In our paper, we present the HairFast model, which uniquely solves these problems and achieves high resolution, near real-time performance, and superior reconstruction compared to optimization-based methods. Our solution includes a new architecture operating in the FS latent space of StyleGAN, an enhanced inpainting approach, improved encoders for better alignment and color transfer, and a new encoder for post-processing. The effectiveness of our approach is demonstrated on realism metrics after random hairstyle transfer and on reconstruction when the original hairstyle is transferred. In the most difficult scenario of transferring both shape and color of a hairstyle from different images, our method performs in less than a second on the Nvidia V100. Our code is available at https://github.com/AIRI-Institute/HairFastGAN.

Paper Structure

This paper contains 38 sections, 27 equations, 16 figures, 7 tables.

Figures (16)

  • Figure 1: HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach. Our method takes as input a photo of the face and the desired hair shape and hair color, and then transfers the selected attributes. The right plot compares our model with the others. We were able to achieve excellent image realism while working in near real time.
  • Figure 2: Overview of HairFast: the images first pass through the Pose alignment module, which generates a pose-aligned face mask with the desired hair shape. Then we transfer the desired hairstyle shape using Shape alignment and the desired hair color using Color alignment. In the last step, Refinement alignment returns the lost details of the original image where they are needed.
  • Figure 3: Detailed diagram of the units. (a) The Mixing block mixes FS and W+ space representations to allow color editing. (b) The Pose alignment module generates a pose-aligned mask with the desired hair shape, and the Shape alignment module takes the images themselves, their segmentation masks, and their $W+$ and $F$ representations to transfer the desired hairstyle shape.
  • Figure 4: Detailed diagram of the units. (a) The Color alignment module takes as input the $S$ representations of the images as well as their segmentation masks. The purpose of this block is to encode the details of the original image and change the $S$ space to transfer the desired hair color while preserving the identity. (b) The Refinement alignment module takes as input the source image and the image produced by the Color alignment module. The goal of this module is to obtain a new representation in StyleGAN space that yields a realistic image with the original details of the source image that were lost when inverting the images into latents.
  • Figure 5: Visual comparison of methods on different cases of transferring hair shape and color together or separately. StyleYourHair transfers color only from the Shape image. According to the visual comparison, our model better preserves the identity of the source image. At the same time, in most cases our method better transfers the desired hair color and texture and works better with complex pose differences. For a more detailed comparison, see the visual comparison appendix.
  • ...and 11 more figures