Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
Jincheng Zhong, Xiangcheng Zhang, Jianmin Wang, Mingsheng Long
TL;DR
Domain Guidance (DoG) is a simple transfer mechanism for pre-trained diffusion models that treats domain transfer as domain-conditioned generation. It keeps the original model as an unconditional guide while training a domain-specific conditional branch, so sampling can be steered toward the target domain without retraining the entire network. Empirical and theoretical analyses show that DoG leverages pre-trained knowledge to improve domain alignment and reduce out-of-domain sampling, outperforming standard CFG-based fine-tuning across seven benchmarks and integrating seamlessly with CFG-finetuned or LoRA-enhanced models. The approach improves generation quality (FID and FD$_\text{DINOv2}$) and remains computationally efficient during sampling, highlighting its utility for rapid, robust domain adaptation of diffusion models.
Abstract
Recent advancements in diffusion models have revolutionized generative modeling. However, the impressive and vivid outputs they produce often come at the cost of significant model scaling and increased computational demands. Consequently, building personalized diffusion models based on off-the-shelf models has emerged as an appealing alternative. In this paper, we introduce a novel perspective on conditional generation for transferring a pre-trained model. From this viewpoint, we propose *Domain Guidance*, a straightforward transfer approach that leverages pre-trained knowledge to guide the sampling process toward the target domain. Domain Guidance shares a formulation similar to advanced classifier-free guidance, facilitating better domain alignment and higher-quality generations. We provide both empirical and theoretical analyses of the mechanisms behind Domain Guidance. Our experimental results demonstrate its substantial effectiveness across various transfer benchmarks, achieving over a 19.6% improvement in FID and a 23.4% improvement in FD$_\text{DINOv2}$ compared to standard fine-tuning. Notably, existing fine-tuned models can seamlessly integrate Domain Guidance to leverage these benefits, without additional training.
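The guidance step described above can be illustrated with a minimal sketch. It assumes that Domain Guidance combines noise predictions by the same linear extrapolation used in classifier-free guidance, with the pre-trained model's prediction standing in for the unconditional one; the abstract states only that the formulation is similar, so the exact form here is an assumption. The names `guided_noise`, `eps_guide`, `eps_target`, and `w` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def guided_noise(
    eps_guide: Sequence[float],
    eps_target: Sequence[float],
    w: float,
) -> List[float]:
    """CFG-style linear combination of two noise predictions.

    eps_guide:  prediction from the guiding model (assumed here to be the
                original pre-trained model, as in Domain Guidance).
    eps_target: prediction from the fine-tuned, domain-conditioned model.
    w:          guidance weight; w = 1 returns eps_target unchanged,
                w > 1 extrapolates away from eps_guide.
    """
    if len(eps_guide) != len(eps_target):
        raise ValueError("predictions must have the same length")
    # eps_hat = eps_guide + w * (eps_target - eps_guide), element-wise
    return [g + w * (t - g) for g, t in zip(eps_guide, eps_target)]


# Toy values standing in for flattened noise predictions at one timestep.
eps_pretrained = [0.0, 1.0]
eps_finetuned = [1.0, 3.0]

print(guided_noise(eps_pretrained, eps_finetuned, w=1.0))  # fine-tuned prediction only
print(guided_noise(eps_pretrained, eps_finetuned, w=2.0))  # pushed further toward the target domain
```

Under this assumption, applying the step to an already fine-tuned model requires no extra training: each sampling step only needs one additional forward pass through the frozen pre-trained model to obtain `eps_guide`.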
