Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal
TL;DR
This work addresses zero-shot domain adaptation of pre-trained image generators by using Score Distillation Sampling (SDS) from large text-to-image diffusion models as the training objective. It adapts StyleGAN2 by distilling diffusion priors into the generator, without requiring ground-truth images from the target domain, and counters mode collapse with a diffusion directional regularizer and a reconstruction regularizer. The approach reports significantly lower FID and competitive CLIP scores relative to prior work, with the clearest gains on long prompts, and extends to 3D-aware generators (EG3D) and DreamBooth guidance. The result is a controllable way to align generators with diverse text-described domains.
Abstract
Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain? In this paper, we show that classifier-free guidance can be leveraged as a critic, enabling generators to distill knowledge from large-scale text-to-image diffusion models. Generators can be efficiently shifted into new domains indicated by text prompts without access to ground-truth samples from the target domains. We demonstrate the effectiveness and controllability of our method through extensive experiments. Although not trained to minimize CLIP loss, our model achieves equally high CLIP scores and significantly lower FID than prior work on short prompts, and outperforms the baseline both qualitatively and quantitatively on long and complicated prompts. To the best of our knowledge, the proposed method is the first attempt at incorporating large-scale pre-trained diffusion models and distillation sampling for text-driven domain adaptation of image generators, and it achieves a quality that was previously out of reach. Moreover, we extend our work to 3D-aware style-based generators and DreamBooth guidance.
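To make the idea of classifier-free guidance acting as a critic more concrete, the following is a minimal sketch of a score-distillation update and not the authors' implementation. It noises the generator output, asks a noise predictor for unconditional and text-conditional estimates, combines them with classifier-free guidance, and uses the difference between the guided estimate and the injected noise as the gradient. Everything named here is an assumption made for illustration: the noise predictor is a hypothetical closed-form toy (not a real diffusion model), the timestep weighting is fixed to 1, a single noise level alpha_bar_t is used, and the "generator output" is a plain vector updated directly, so backpropagation through a real generator is omitted.

```python
import numpy as np


def cfg_noise(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    noise estimate toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


def make_toy_eps_model(target, alpha_bar_t):
    """Hypothetical stand-in for a text-to-image diffusion model.

    Returns the exact noise predictions for two point-mass data
    distributions: one at the origin ("unconditional") and one at
    `target` ("conditioned on the prompt").
    """
    target = np.asarray(target, dtype=float)
    scale = np.sqrt(1.0 - alpha_bar_t)

    def eps_model(x_t):
        eps_uncond = x_t / scale
        eps_cond = (x_t - np.sqrt(alpha_bar_t) * target) / scale
        return eps_uncond, eps_cond

    return eps_model


def sds_grad(x, eps_model, alpha_bar_t, guidance_scale, rng):
    """Score-distillation gradient with respect to the output x,
    with the timestep weight w(t) assumed to be 1."""
    eps = rng.standard_normal(x.shape)
    # Forward diffusion: mix the clean sample with Gaussian noise.
    x_t = np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * eps
    eps_uncond, eps_cond = eps_model(x_t)
    eps_hat = cfg_noise(eps_uncond, eps_cond, guidance_scale)
    # The noise predictor is treated as a fixed critic (no gradient through it).
    return eps_hat - eps


def adapt(x0, target, steps=200, lr=0.1, alpha_bar_t=0.5,
          guidance_scale=1.0, seed=0):
    """Gradient descent on x using the SDS gradient from the toy critic."""
    rng = np.random.default_rng(seed)
    eps_model = make_toy_eps_model(target, alpha_bar_t)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x -= lr * sds_grad(x, eps_model, alpha_bar_t, guidance_scale, rng)
    return x


if __name__ == "__main__":
    print(adapt([0.0, 0.0, 0.0], [1.0, -2.0, 0.5]))
```

In this toy setting the update pulls x toward the point the conditional predictor was built around, without ever sampling from that target. With a guidance scale above 1, the same update settles at guidance_scale times the target instead, a toy-specific illustration of how guidance pushes past the conditional estimate. In a real setup, x would be the output of the generator, and the same gradient would be propagated to the generator weights through the chain rule.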
