Zero-Shot Head Swapping in Real-World Scenarios

Taewoong Kang; Sohyun Jeong; Hyojin Jang; Jaegul Choo

TL;DR

This work addresses the challenge of zero-shot head swapping in real-world images that include full heads and upper bodies with diverse poses. The authors introduce HID, a diffusion-based framework that integrates an IOMask for automatic context-aware masking and a Hair Injection Module to preserve hairstyle details, enabling seamless head-body fusion. By leveraging DDIM inversion, identity-oriented embedding fusion (PhotoMaker V2), and ControlNet-driven pose conditioning, HID achieves state-of-the-art results on challenging data, outperforming baselines in hair fidelity, identity preservation, and image quality. The approach reduces the need for cropping or post-hoc compositing, improving practicality for real-world applications in media, avatars, and editing workflows.

Abstract

With growing demand in media and social networks for personalized images, the need for advanced head-swapping techniques, which integrate an entire head from a head image with the body from a body image, has increased. However, traditional head-swapping methods rely heavily on face-centered cropped data with primarily frontal-facing views, which limits their effectiveness in real-world applications. Additionally, their masking methods, designed to indicate regions requiring editing, are optimized for these types of datasets but struggle to achieve seamless blending in complex situations, such as when the original data includes features like long hair extending beyond the masked area. To overcome these limitations and enhance adaptability in diverse and complex scenarios, we propose a novel head-swapping method, HID, that is robust to images including the full head and upper body, handles views ranging from frontal to side, and automatically generates context-aware masks. For automatic mask generation, we introduce the IOMask, which enables seamless blending of the head and body, effectively addressing integration challenges. We further introduce the hair injection module to capture hair details with greater precision. Our experiments demonstrate that the proposed approach achieves state-of-the-art performance in head swapping, providing visually consistent and realistic results across a wide range of challenging conditions.
