Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation
Yijiang Li, Sucheng Ren, Weipeng Deng, Yuzhi Xu, Ying Gao, Edith Ngai, Haohan Wang
TL;DR
The paper tackles out-of-distribution generalization under limited source-domain data by proposing a data-free paradigm that extrapolates novel domains through large language models (LLMs) and bridges text-based knowledge to pixel space via text-to-image synthesis. A theoretical bound shows how generalization can be controlled when replacing the unknown meta-distribution $\mu$ with a proxy $\mu'$ drawn from LLM-derived knowledge, with an explicit $\epsilon$-dependence: $\mathcal{L}^{\mu}(f) \le \hat{\mathcal{L}}^{\mu'}(f) + 2\mathcal{R}_{mn}(\mathcal{F}) + 2\mathcal{R}_{n}(\mathcal{F}) + 3\sqrt{\frac{\ln(2/\delta)}{2mn}} + 3\sqrt{\frac{\ln(2/\delta)}{n}} + \epsilon$. The method extracts class-level novel-domain knowledge via prompting (including Chain-of-Thought and role prompts), generates corresponding images with diffusion-based models, and trains on this synthetic data to improve domain generalization (DG) performance, achieving strong results on DomainBed benchmarks in both standard and data-free settings. Overall, the approach demonstrates substantial OOD gains, scalability with more extrapolated domains, and a viable data-free route to robust generalization, while acknowledging biases and domain-specific limitations of foundation models.
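To get a feel for the two confidence terms in the bound above, the following minimal sketch evaluates them numerically. It assumes that $n$ counts training domains and $m$ counts samples per domain, which is how the subscripts $mn$ and $n$ read but is not stated in this summary. The function name `confidence_terms` is hypothetical, and the Rademacher complexities and $\epsilon$ are left out because they depend on the hypothesis class and the proxy distribution.

```python
import math


def confidence_terms(m: int, n: int, delta: float) -> tuple[float, float]:
    """Evaluate the two square-root terms of the bound.

    Assumption (not stated in the summary): n = number of domains,
    m = number of samples per domain, delta = failure probability.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")

    log_term = math.log(2.0 / delta)
    # 3 * sqrt(ln(2/delta) / (2 m n)): shrinks with the total sample count m * n.
    sample_term = 3.0 * math.sqrt(log_term / (2.0 * m * n))
    # 3 * sqrt(ln(2/delta) / n): shrinks only as the number of domains n grows.
    domain_term = 3.0 * math.sqrt(log_term / n)
    return sample_term, domain_term
```

Under that reading, the second term does not depend on $m$ at all: collecting more images per domain leaves it unchanged, while quadrupling the number of domains halves it. This matches the summary's emphasis on scaling the number of extrapolated domains.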
Abstract
Out-of-distribution (OOD) generalization is a desirable yet challenging property for deep neural networks. The core challenge lies in the limited availability of source domains, which are what allow a model to separate invariant representations from spurious features. Various domain augmentation methods have been proposed, but they largely rely on interpolating existing domains and frequently struggle to create truly "novel" domains. Humans, on the other hand, can easily extrapolate novel domains, so an intriguing question arises: How can neural networks extrapolate like humans and achieve OOD generalization? We introduce a novel approach to domain extrapolation that leverages the reasoning ability and extensive knowledge encapsulated within large language models (LLMs) to synthesize entirely new domains. Starting with the class of interest, we query the LLMs to extract relevant knowledge for these novel domains. We then bridge the gap between the text-centric knowledge derived from LLMs and the pixel input space of the model using text-to-image generation techniques. By augmenting the training set of domain generalization datasets with high-fidelity, photo-realistic images of these new domains, we achieve significant improvements over all existing methods, as demonstrated in both single and multi-domain generalization across various benchmarks. With the ability to extrapolate any domain for any class, our method has the potential to learn a generalized model for any task without any data. To illustrate, we put forth a much more difficult setting, termed data-free domain generalization, that aims to learn a generalized model in the absence of any collected data. Our empirical findings support the above argument, and our method exhibits commendable performance in this setting, even surpassing the supervised setting by approximately 1-2\% on datasets such as VLCS.
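The text side of the pipeline described above (query an LLM for novel domains of a class, then hand the results to a text-to-image model) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names, the query wording, and the prompt template are all assumptions, and the LLM and diffusion model calls are deliberately left out, so the domain list is passed in as plain strings.

```python
from typing import Dict, List


def build_domain_query(class_name: str, num_domains: int) -> str:
    """Compose a query asking an LLM for novel domains of one class.

    The wording is a hypothetical placeholder, not the paper's prompt.
    """
    return (
        f"List {num_domains} distinct visual domains or styles in which "
        f"the object class '{class_name}' could plausibly appear, one per line."
    )


def build_image_prompts(class_name: str, domains: List[str]) -> List[str]:
    """Turn domain descriptions (e.g. parsed LLM output) into image prompts.

    Blank entries are skipped; the "<class>, <domain>" template is an assumption.
    """
    return [f"{class_name}, {d.strip()}" for d in domains if d.strip()]


def build_synthetic_plan(domains_per_class: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map every class label to the prompts a text-to-image model would receive.

    Each generated image would inherit the class label of its prompt, so the
    synthetic set comes labeled without any collected data.
    """
    return {
        class_name: build_image_prompts(class_name, domains)
        for class_name, domains in domains_per_class.items()
    }
```

For example, `build_synthetic_plan({"dog": ["pencil sketch", "oil painting"]})` yields one prompt per (class, domain) pair. In the data-free setting, the images generated from such prompts would form the entire training set.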
