MegaPortrait: Revisiting Diffusion Control for High-fidelity Portrait Generation

Han Yang, Sotiris Anagnostidis, Enis Simsar, Thomas Hofmann

TL;DR

MegaPortrait is a system for generating personalized portrait images, built from three modules: Identity Net, Shading Net, and Harmonization Net. The authors report that it outperforms state-of-the-art AI portrait products in identity preservation and image fidelity.

Abstract

We propose MegaPortrait, a system for creating personalized portrait images. It has three modules: Identity Net, Shading Net, and Harmonization Net. Identity Net generates the learned identity using a customized model fine-tuned on the source images. Shading Net re-renders portraits using the extracted representations. Harmonization Net fuses the pasted face with the reference image's body to produce a coherent result. Built on off-the-shelf ControlNets, our approach outperforms state-of-the-art AI portrait products in identity preservation and image fidelity. MegaPortrait has a simple but effective design, and we compare it with other methods and products to demonstrate these gains.

Paper Structure

This paper contains 14 sections, 4 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Our approach, MegaPortrait, produces high-quality results. It takes the person's identity from the source image and uses the reference image as the style and pose reference. MegaPortrait generates high-fidelity results that integrate the source individual's features with the style and pose extracted from the reference image.
  • Figure 2: The overall pipeline of MegaPortrait, which consists of three modules: Identity Net, Shading Net, and Harmonization Net. The Identity Net generates an image pair with different identity-shading trade-offs. We extract the light maps with low-pass filters from the more reference-like image, and the HF-Maps from the more source-like image. The Shading Net then takes these two control conditions to synthesize a geometrically correct stylized image, which is pasted back onto the reference image for final harmonization by the Harmonization Net.
  • Figure 3: Extensive cross-combination results of MegaPortrait with various source IDs and reference styles. The reference images are blue-boxed and the source images are red-boxed.
  • Figure 4: Visual comparison of MegaPortrait with the state-of-the-art portrait generation baselines FastComposer and IP-Adapter, using Audrey Hepburn as the source ID. Our results retain the reference style as much as possible while preserving the identity of the source image. Since FastComposer and IP-Adapter only support ID reference with prompting, we extract the prompts from the reference image to test these methods. We also compare our results with the state-of-the-art AI photoshooting product Remini. The FaceSwap baseline uses the swapper from InsightFace.
  • Figure 5: Results of using IP-Adapter-FaceID with style and identity references. IP-Adapter-FaceID can only handle identity input and discards the style information. Meanwhile, the original version of IP-Adapter only supports style reference and lacks the ability to capture identity information, as shown in its paper.
  • ...and 3 more figures
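
The Figure 2 caption mentions extracting light maps with low-pass filters and HF-Maps from the Identity Net image pair. The sketch below illustrates one plausible way to compute such maps with NumPy. It is only an illustration under stated assumptions: the Gaussian filter, the `sigma` value, the definition of the HF-Map as the residual left after low-pass filtering, and all function and variable names are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """1-D Gaussian kernel, normalized to sum to 1 (radius of 3*sigma is an assumption)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def low_pass(image: np.ndarray, sigma: float = 5.0) -> np.ndarray:
    """Separable Gaussian blur over the two spatial axes of an HxW or HxWxC image."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = image.astype(np.float64)
    for axis in (0, 1):
        # Edge padding keeps the output the same size as the input.
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


def extract_control_maps(reference_like: np.ndarray,
                         source_like: np.ndarray,
                         sigma: float = 5.0):
    """Return (light_map, hf_map) as a hypothetical pair of control conditions.

    light_map: low-pass filtered version of the more reference-like image.
    hf_map:    high-frequency residual of the more source-like image
               (assumed definition: image minus its low-pass version).
    """
    light_map = low_pass(reference_like, sigma)
    hf_map = source_like.astype(np.float64) - low_pass(source_like, sigma)
    return light_map, hf_map


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference_like = rng.random((64, 64, 3))  # stand-in for a generated image
    source_like = rng.random((64, 64, 3))     # stand-in for a generated image
    light_map, hf_map = extract_control_maps(reference_like, source_like)
    print(light_map.shape, hf_map.shape)
```

In this sketch the light map keeps only coarse shading, while the HF-Map keeps fine detail such as edges and texture. How the Shading Net actually consumes these maps is described in the paper itself and is not reproduced here.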