HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion

Yu Zeng, Yang Zhang, Jiachen Liu, Linlin Shen, Kaijun Deng, Weizhao He, Jinbao Wang

TL;DR

This work introduces Multi-stage Hairstyle Blend (MHB), which separates control of hair color and hairstyle in the diffusion latent space, and fine-tunes a CLIP model on a multi-color hairstyle dataset.

Abstract

Hair editing is a critical image synthesis task that aims to edit hair color and hairstyle using text descriptions or reference images while preserving irrelevant attributes (e.g., identity, background, clothing). Many existing methods address this task with StyleGAN. However, due to the limited spatial distribution of StyleGAN, these methods struggle with multi-color hair editing and facial preservation. Considering the advancements in diffusion models, we utilize Latent Diffusion Models (LDMs) for hairstyle editing. Our approach introduces Multi-stage Hairstyle Blend (MHB), which effectively separates control of hair color and hairstyle in the diffusion latent space. Additionally, we train a warping module to align the hair color with the target region. To further enhance multi-color hairstyle editing, we fine-tune a CLIP model on a multi-color hairstyle dataset. Our method not only tackles the complexity of multi-color hairstyles but also addresses the challenge of preserving original colors during diffusion editing. Extensive experiments demonstrate the superiority of our method in editing multi-color hairstyles while preserving facial attributes, given textual descriptions and reference images.
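
To make the idea of blending at different diffusion stages concrete, the following is a minimal sketch of a staged, mask-based latent blend. It is not the MHB procedure from the paper, whose details are not given in the abstract. Every name here (`staged_blend`, `style_steps`, `color_steps`, `weight`), the order of the stages, and the flat-list latents are assumptions for illustration, and the noising of the proxies and source to the current timestep is omitted.

```python
from typing import Callable, List, Sequence

Latent = List[float]


def blend(inside: Sequence[float], outside: Sequence[float],
          mask: Sequence[float]) -> Latent:
    """Per-element blend: mask=1 keeps `inside`, mask=0 keeps `outside`."""
    return [m * a + (1.0 - m) * b for a, b, m in zip(inside, outside, mask)]


def staged_blend(
    z: Latent,
    style_proxy: Latent,
    color_proxy: Latent,
    source: Latent,
    hair_mask: Sequence[float],
    denoise_step: Callable[[Latent, int], Latent],
    num_steps: int,
    style_steps: int,
    color_steps: int,
    weight: float = 0.5,
) -> Latent:
    """Hypothetical schedule: style guidance first, then color guidance."""
    for step in range(num_steps):
        z = denoise_step(z, step)
        if step < style_steps:
            proxy = style_proxy   # assumed: early steps fix hair structure
        elif step < style_steps + color_steps:
            proxy = color_proxy   # assumed: later steps pull in hair color
        else:
            proxy = None          # final steps run without proxy guidance
        if proxy is not None:
            z = [weight * p + (1.0 - weight) * x for p, x in zip(proxy, z)]
        # Outside the hair mask, always restore the source latent so that
        # identity and background are left untouched.
        z = blend(z, source, hair_mask)
    return z


if __name__ == "__main__":
    # Toy run with an identity "denoiser" and a two-element latent.
    result = staged_blend(
        z=[0.0, 0.0], style_proxy=[1.0, 1.0], color_proxy=[3.0, 3.0],
        source=[9.0, 9.0], hair_mask=[1.0, 0.0],
        denoise_step=lambda latent, step: list(latent),
        num_steps=3, style_steps=1, color_steps=1,
    )
    print(result)
```

In this toy setup the masked element is pulled first toward the style proxy and then toward the color proxy, while the unmasked element is reset to the source value after every step. This mirrors the stated goal of editing hair while preserving irrelevant attributes.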

Paper Structure

This paper contains 21 sections, 11 equations, 15 figures, 4 tables.

Figures (15)

  • Figure 1: Our framework supports individual or collaborative editing of hairstyle and color, utilizing text, reference images, and stroke maps. It performs particularly well when editing multiple hair colors.
  • Figure 2: Overview of HairDiffusion: (a) Using a hairstyle description $T^\mathrm{s}$ or reference image $I^\mathrm{s}_\mathrm{r}$ as conditional input, coupled with the hair-agnostic mask $M_\mathrm{a}$ and source image $I_\mathrm{src}$, we obtain the style proxy $P^s$. (b) Leveraging the color proxy and style proxy, along with the hair-agnostic mask $M_\mathrm{c}$ and source image $I_\mathrm{src}$, enables individual or collaborative editing of hair color and hairstyle. (c) Given a series of conditions derived from the input image $I_\mathrm{c}$, the hair color reference image $I^\mathrm{c}_\mathrm{r}$ is used to obtain the color proxy $P^c$ through a warping module. When only the hairstyle is changed and the original hair color is preserved, $I^\mathrm{c}_\mathrm{r} = I_\mathrm{src}$. (d) The color proxy $P^c$ and the style proxy $P^s$ are blended at different stages of the diffusion process.
  • Figure 3: Visual comparison with HairCLIPv2 [wei2023hairclipv2], HairCLIP [wei2022hairclip], TediGAN [xia2021tedigan], PowerPaint ("ControlNet" version) [zhuang2023powerpainttask], ControlNet-Inpainting [zhang2023controllnet], and DiffCLIP [kim2022diffusionclip]. The simplified text descriptions (editing hairstyle, hair color, or both) are listed on the leftmost side. Our approach demonstrates better editing effects and better preservation of irrelevant attributes (e.g., identity, background).
  • Figure 4: Detailed comparison with HairCLIPv2 [wei2023hairclipv2]. Our approach shows better preservation of irrelevant attributes.
  • Figure 5: Visual comparison with HairCLIPv2 [wei2023hairclipv2], HairCLIP [wei2022hairclip], Barbershop [zhu2021barbershop], CtrlHair [guo2022controlhairl], MichiGAN [tan2020michigan], and HairFastGAN [nikolaev2024hairfastgan] on hair color transfer.
  • ...and 10 more figures