FastFace: Tuning Identity Preservation in Distilled Diffusion via Guidance and Attention
Sergey Karpukhin, Vadim Titov, Andrey Kuznetsov, Aibek Alanov
TL;DR
FastFace addresses the challenge of adapting pretrained ID-preserving adapters to distilled diffusion models without retraining, enabling real-time, few-step generation. It introduces two complementary components, Decoupled Classifier-Free Guidance (DCG) and Attention Manipulation (AM), and integrates them into a universal training-free framework that handles both stylistic and realistic generation scenarios. DCG mathematically decouples identity conditioning from text conditioning and employs scheduling and rescaling to stabilize few-step inference; AM focuses attention maps on facial regions using scale-power and scheduled-softmask transforms, improving identity fidelity with minimal artifacts. The authors provide a disentangled evaluation dataset and demonstrate consistent improvements in identity preservation, prompt alignment, and image quality across multiple distilled checkpoints, highlighting FastFace's practical impact for real-time, personalized diffusion-based generation.
Abstract
In recent years, a plethora of identity-preserving adapters for personalized generation with diffusion models have been released. Their main disadvantage is that they are predominantly trained jointly with base diffusion models, which suffer from slow multi-step inference. This work tackles the challenge of training-free adaptation of pretrained ID-adapters to diffusion models accelerated via distillation. Through a careful redesign of classifier-free guidance for few-step stylistic generation and of attention manipulation mechanisms in decoupled blocks to improve identity similarity and fidelity, we propose the universal FastFace framework. Additionally, we develop a disentangled public evaluation protocol for ID-preserving adapters.
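To make the idea of decoupling identity guidance from text guidance concrete, the following is a minimal sketch of one common way to combine three noise predictions with separate guidance scales, followed by a standard-deviation rescaling step. It is an illustration under stated assumptions, not the exact formulation used in FastFace: the function names, the three-prediction decomposition, the scale parameters `w_text` and `w_id`, and the choice of rescaling reference are all hypothetical.

```python
import numpy as np


def decoupled_guidance(eps_uncond, eps_text, eps_text_id, w_text, w_id):
    """Combine noise predictions using separate text and identity scales.

    Assumed (hypothetical) inputs:
      eps_uncond  - prediction with neither text nor identity conditioning
      eps_text    - prediction with text conditioning only
      eps_text_id - prediction with text and identity conditioning
    """
    text_direction = eps_text - eps_uncond   # effect of adding the prompt
    id_direction = eps_text_id - eps_text    # effect of adding the identity
    return eps_uncond + w_text * text_direction + w_id * id_direction


def rescale_to_reference(eps_guided, eps_reference, eps=1e-8):
    """Match the standard deviation of the guided prediction to a reference.

    This is an assumed stabilization step: large guidance scales inflate the
    magnitude of the prediction, which few-step samplers tolerate poorly.
    """
    scale = eps_reference.std() / (eps_guided.std() + eps)
    return eps_guided * scale


rng = np.random.default_rng(0)
eps_uncond = rng.standard_normal((4, 8, 8))
eps_text = rng.standard_normal((4, 8, 8))
eps_text_id = rng.standard_normal((4, 8, 8))

guided = decoupled_guidance(eps_uncond, eps_text, eps_text_id, w_text=1.5, w_id=2.0)
stabilized = rescale_to_reference(guided, eps_text_id)
```

With `w_text = 1` and `w_id = 1` the combination reduces to the fully conditioned prediction, and with `w_id = 0` the identity term drops out entirely, so the two scales can be tuned or scheduled over sampling steps independently of each other.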
