A Sampling-Based Domain Generalization Study with Diffusion Generative Models
Ye Zhu, Yu Wu, Duo Xu, Zhiwei Deng, Yan Yan, Olga Russakovsky
TL;DR
We address domain generalization for diffusion models in a few-shot setting by introducing a sampling-based approach that inverts unseen data to obtain latent priors. The method uses deterministic DDIM denoising on frozen, pre-trained diffusion models to sample from out-of-domain (OOD) latent priors, which are approximately Gaussian and separable from the in-domain prior, enabling synthesis of unseen-domain data without fine-tuning. Empirical results across multiple natural-image and astrophysical datasets show improved generation of unseen-domain samples, particularly when domain gaps are large, while preserving generation quality on the original domain. This tuning-free, data-efficient paradigm broadens the applicability of diffusion models to data-sparse scientific domains and suggests avenues for further cross-domain sampling strategies.
Abstract
In this work, we investigate the domain generalization capabilities of diffusion models in the context of synthesizing images that are distinct from the training data. Instead of fine-tuning, we tackle this challenge from a sampling-based perspective using frozen, pre-trained diffusion models. Specifically, we demonstrate that arbitrary out-of-domain (OOD) images establish Gaussian priors in the latent spaces of a given model after inversion, and that these priors are separable from those of the original training domain. This OOD latent property allows us to synthesize new images of the target unseen domain by discovering qualified OOD latent encodings in the inverted noisy spaces, without altering the pre-trained models. Our cross-model and cross-domain experiments show that the proposed sampling-based method can expand the latent space and generate unseen images without impairing the generation quality of the original domain. We also showcase a practical application of our approach using astrophysical data, highlighting the potential of this generalization paradigm in data-sparse fields such as scientific exploration.
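The pipeline described above (invert a few unseen images, characterize their latents, sample new latents, denoise deterministically) can be illustrated with a minimal NumPy sketch. This is a toy under stated assumptions, not the authors' implementation: `toy_eps_model` is a hypothetical stand-in for a frozen noise predictor (it is the optimal predictor when the training data is a standard normal), the linear beta schedule is a common default rather than one taken from this work, and fitting a per-dimension Gaussian to the inverted latents is only one simple way to read "Gaussian prior".

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta). Index 0 is the clean-data level (alpha_bar = 1)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])

def toy_eps_model(x, alpha_bar_t):
    """Hypothetical frozen noise predictor (assumption: training data ~ N(0, I))."""
    return np.sqrt(1.0 - alpha_bar_t) * x

def ddim_step(x, ab_from, ab_to, eps_model):
    """One deterministic DDIM update (eta = 0) between two noise levels."""
    eps = eps_model(x, ab_from)
    x0_pred = (x - np.sqrt(1.0 - ab_from) * eps) / np.sqrt(ab_from)
    return np.sqrt(ab_to) * x0_pred + np.sqrt(1.0 - ab_to) * eps

def ddim_invert(x0, alpha_bars, eps_model):
    """Map clean samples to latents by running the DDIM update toward higher noise."""
    x = x0
    for t in range(len(alpha_bars) - 1):
        x = ddim_step(x, alpha_bars[t], alpha_bars[t + 1], eps_model)
    return x

def ddim_denoise(x_T, alpha_bars, eps_model):
    """Map latents back to samples by running the DDIM update toward lower noise."""
    x = x_T
    for t in range(len(alpha_bars) - 1, 0, -1):
        x = ddim_step(x, alpha_bars[t], alpha_bars[t - 1], eps_model)
    return x

def fit_gaussian(latents):
    """Per-dimension mean and standard deviation of the inverted latents (an assumption)."""
    return latents.mean(axis=0), latents.std(axis=0) + 1e-8

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()

# Few-shot "unseen" data: 8 samples shifted away from the toy training domain N(0, I).
ood_images = rng.normal(loc=3.0, scale=0.5, size=(8, 16))

# 1) Invert the unseen samples with the frozen model.
ood_latents = ddim_invert(ood_images, alpha_bars, toy_eps_model)

# 2) Summarize the OOD latents with a Gaussian and draw new latents from it.
ood_mean, ood_std = fit_gaussian(ood_latents)
new_latents = ood_mean + ood_std * rng.standard_normal((4, 16))

# 3) Deterministically denoise the new latents; the model is never updated.
new_samples = ddim_denoise(new_latents, alpha_bars, toy_eps_model)

# Sanity check: denoising the inverted latents approximately recovers the inputs.
reconstructed = ddim_denoise(ood_latents, alpha_bars, toy_eps_model)
```

In this toy setting the inversion is close to an identity map, because the stand-in model's training distribution is already a standard normal. The OOD latents therefore sit around a mean of about 3 per dimension, well away from the zero-mean in-domain prior, which gives a rough picture of what separable priors could look like. A real model would require a learned noise predictor and image-shaped tensors.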
