HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach
Maxim Nikolaev, Mikhail Kuznetsov, Dmitry Vetrov, Aibek Alanov
TL;DR
HairFastGAN addresses hair transfer under large pose differences with a fast encoder-based pipeline that operates in StyleGAN's FS latent space and splits the problem into four stages: pose alignment, shape alignment, color alignment, and refinement alignment. A Rotate Encoder handles pose differences via pose-consistent rotation, SEAN-based inpainting supports hair-shape transfer, FS/W+ mixing enables color editing in latent space, and CLIP-guided color editing helps preserve identity. The refinement alignment stage then recovers details lost in the earlier stages at high resolution. The pipeline runs in near real time (under 1 s on an NVIDIA V100) with realism metrics competitive with optimization-based baselines. Ablation studies confirm the necessity of each module, and failure analyses outline limitations with long textures and extreme lighting, with plans to extend color-editing and shape-control capabilities.
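The stage ordering described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `TransferState` container, the stage function names, and the `transfer_hair` entry point are placeholders invented for illustration and are not the API of the released code. Each stage only records that it ran, so the sketch shows the data flow and nothing about the models themselves.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class TransferState:
    """Hypothetical container passed between stages (not the authors' API)."""
    face: str
    shape_ref: str
    color_ref: str
    trace: Tuple[str, ...] = ()


Stage = Callable[[TransferState], TransferState]


def _record(state: TransferState, name: str) -> TransferState:
    # Return a copy of the state with the stage name appended to the trace.
    return replace(state, trace=state.trace + (name,))


def pose_alignment(state: TransferState) -> TransferState:
    # Placeholder for the Rotate Encoder step.
    return _record(state, "pose")


def shape_alignment(state: TransferState) -> TransferState:
    # Placeholder for the inpainting-based hair-shape transfer step.
    return _record(state, "shape")


def color_alignment(state: TransferState) -> TransferState:
    # Placeholder for the latent-space color transfer step.
    return _record(state, "color")


def refinement_alignment(state: TransferState) -> TransferState:
    # Placeholder for the detail-recovery step.
    return _record(state, "refinement")


PIPELINE: List[Stage] = [
    pose_alignment,
    shape_alignment,
    color_alignment,
    refinement_alignment,
]


def transfer_hair(face: str, shape_ref: str, color_ref: str) -> TransferState:
    """Run the four placeholder stages in the order the TL;DR lists them."""
    state = TransferState(face=face, shape_ref=shape_ref, color_ref=color_ref)
    for stage in PIPELINE:
        state = stage(state)
    return state


result = transfer_hair("face.png", "shape_ref.png", "color_ref.png")
print(result.trace)
```

Taking the shape and color references as separate inputs mirrors the hardest scenario mentioned in the abstract, where shape and color come from different images.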
Abstract
Our paper addresses the complex task of transferring a hairstyle from a reference image to an input photo for virtual hair try-on. This task is challenging due to the need to adapt to various photo poses, the sensitivity of hairstyles, and the lack of objective metrics. Current state-of-the-art hairstyle transfer methods use an optimization process for different parts of the approach, making them inexcusably slow. At the same time, faster encoder-based models are of very low quality because they either operate in StyleGAN's W+ space or use other low-dimensional image generators. Additionally, both approaches struggle with hairstyle transfer when the source pose is very different from the target pose, because they either do not consider the pose at all or deal with it inefficiently. In our paper, we present the HairFast model, which solves these problems and achieves high resolution, near real-time performance, and superior reconstruction compared to optimization-based methods. Our solution includes a new architecture operating in the FS latent space of StyleGAN, an enhanced inpainting approach, improved encoders for better alignment and color transfer, and a new encoder for post-processing. The effectiveness of our approach is demonstrated on realism metrics after random hairstyle transfer and on reconstruction metrics when the original hairstyle is transferred. In the most difficult scenario of transferring both the shape and the color of a hairstyle from different images, our method runs in less than a second on an NVIDIA V100. Our code is available at https://github.com/AIRI-Institute/HairFastGAN.
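To give a rough sense of why the abstract calls W+ low-dimensional compared with the FS space, the arithmetic below counts the values in each representation. The shapes are assumptions that the abstract does not state: W+ is taken as 18 style vectors of size 512 (the usual 1024×1024 StyleGAN2 configuration), and FS as a 512×32×32 feature tensor F plus 11 remaining style vectors S, following the FS space as defined in Barbershop. Treat the numbers as an illustration under those assumptions.

```python
from math import prod

# Assumed shapes (not stated in the abstract):
W_PLUS_SHAPE = (18, 512)     # 18 style vectors, 512 values each
F_SHAPE = (512, 32, 32)      # spatial feature tensor F
S_SHAPE = (11, 512)          # style vectors S for the remaining layers


def num_values(*shapes):
    """Total number of scalar values across the given tensor shapes."""
    return sum(prod(shape) for shape in shapes)


w_plus_size = num_values(W_PLUS_SHAPE)   # 18 * 512
fs_size = num_values(F_SHAPE, S_SHAPE)   # 512 * 32 * 32 + 11 * 512

print(f"W+ values: {w_plus_size}")
print(f"FS values: {fs_size}")
print(f"FS / W+ ratio: {fs_size / w_plus_size:.1f}")
```

Under these assumed shapes the FS representation holds roughly 57 times as many values as W+, most of them in the spatial tensor F, which is consistent with the abstract's point that W+ alone limits reconstruction quality.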
