Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation

Yijiang Li, Sucheng Ren, Weipeng Deng, Yuzhi Xu, Ying Gao, Edith Ngai, Haohan Wang

TL;DR

The paper tackles out-of-distribution generalization under limited source-domain data by proposing a data-free paradigm that extrapolates novel domains through large language models (LLMs) and bridges text-based knowledge to pixel space via text-to-image synthesis. A theoretical bound shows how generalization can be controlled when replacing the unknown meta-distribution $\mu$ with a proxy $\mu'$ drawn from LLM-derived knowledge, with an explicit $\epsilon$-dependence: $\mathcal{L}^{\mu}(f) \le \hat{\mathcal{L}}^{\mu'}(f) + 2\mathcal{R}_{mn}(\mathcal{F}) + 2\mathcal{R}_{n}(\mathcal{F}) + 3\sqrt{\frac{\ln(2/\delta)}{2mn}} + 3\sqrt{\frac{\ln(2/\delta)}{n}} + \epsilon$. The method extracts class-level novel-domain knowledge via prompting (including Chain-of-Thought and role prompts), generates corresponding images with diffusion-based models, and trains on this synthetic data to improve DG performance, achieving strong results on DomainBed benchmarks in both standard and data-free settings. Overall, the approach demonstrates substantial OOD gains, scalability with more extrapolated domains, and a viable data-free route to robust generalization, while acknowledging biases and domain-specific limitations of foundation models.
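
To get a feel for how the bound quoted above behaves, the following minimal sketch evaluates its right-hand side numerically. It is an illustration only: the function names are hypothetical, the empirical loss, the two Rademacher complexity values and $\epsilon$ are supplied by the caller as plain numbers, and the TL;DR does not define $m$ and $n$ (reading $n$ as the number of domains and $m$ as the number of samples per domain is an assumption).

```python
import math


def concentration_terms(m: int, n: int, delta: float) -> float:
    """Sum of the two square-root terms of the bound as written in the TL;DR.

    m, n and delta are taken directly from the formula; their interpretation
    (samples per domain, number of domains) is an assumption of this sketch.
    """
    log_term = math.log(2.0 / delta)
    term_mn = 3.0 * math.sqrt(log_term / (2.0 * m * n))
    term_n = 3.0 * math.sqrt(log_term / n)
    return term_mn + term_n


def bound_rhs(emp_loss: float, rad_mn: float, rad_n: float,
              m: int, n: int, delta: float, eps: float) -> float:
    """Right-hand side of the bound: empirical loss on the proxy, plus
    complexity terms, plus concentration terms, plus the proxy gap eps."""
    return emp_loss + 2.0 * rad_mn + 2.0 * rad_n + concentration_terms(m, n, delta) + eps


if __name__ == "__main__":
    # Placeholder numbers chosen only to show the trend, not taken from the paper.
    for n in (4, 16, 64):
        print(n, round(bound_rhs(0.10, 0.05, 0.05, m=64, n=n, delta=0.05, eps=0.02), 4))
```

With everything else held fixed, the square-root terms shrink as $n$ grows, while $\epsilon$ enters additively and does not shrink with more data.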

Abstract

Out-of-distribution (OOD) generalization is a favorable yet challenging property for deep neural networks. The core challenge lies in the limited availability of source domains that help models learn an invariant representation from the spurious features. Various domain augmentation methods have been proposed, but they largely rely on interpolating existing domains and frequently face difficulties in creating truly "novel" domains. Humans, on the other hand, can easily extrapolate novel domains. Thus, an intriguing question arises: How can neural networks extrapolate like humans and achieve OOD generalization? We introduce a novel approach to domain extrapolation that leverages the reasoning ability and the extensive knowledge encapsulated within large language models (LLMs) to synthesize entirely new domains. Starting with the class of interest, we query the LLMs to extract relevant knowledge for these novel domains. We then bridge the gap between the text-centric knowledge derived from LLMs and the pixel input space of the model using text-to-image generation techniques. By augmenting the training set of domain generalization datasets with high-fidelity, photo-realistic images of these new domains, we achieve significant improvements over all existing methods, as demonstrated in both single and multi-domain generalization across various benchmarks. With the ability to extrapolate any domain for any class, our method has the potential to learn a generalized model for any task without any data. To illustrate, we put forth a much more difficult setting, termed data-free domain generalization, which aims to learn a generalized model in the absence of any collected data. Our empirical findings support the above argument, and our methods exhibit commendable performance in this setting, even surpassing the supervised setting by approximately 1-2% on datasets such as VLCS.
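
The first two stages described in the abstract (query an LLM for novel domains of a class, then turn each domain into a text-to-image prompt) can be sketched as plain string handling. This is a hypothetical illustration, not the authors' implementation: the prompt wording, the function names and the reply format are all assumptions, and no LLM or image-generation API is called.

```python
import re
from typing import List


def build_domain_query(class_name: str, num_domains: int) -> str:
    """Compose an LLM query that combines a role prompt with a
    step-by-step (Chain-of-Thought style) instruction. Wording is hypothetical."""
    return (
        "You are an expert photographer and visual artist. "
        f"Let's think step by step: list {num_domains} distinct visual domains "
        f"(styles, settings, or media) in which a '{class_name}' could appear. "
        "Return one domain per line."
    )


def parse_domains(llm_reply: str) -> List[str]:
    """Extract one domain per line, dropping bullets and list numbering."""
    domains = []
    for line in llm_reply.splitlines():
        cleaned = re.sub(r"^\s*(?:[-*]|\d+[.)])\s*", "", line).strip()
        if cleaned:
            domains.append(cleaned)
    return domains


def build_image_prompts(class_name: str, domains: List[str]) -> List[str]:
    """Turn each extracted domain into a prompt for a text-to-image model."""
    return [f"a {class_name}, {domain}" for domain in domains]


if __name__ == "__main__":
    # A canned reply stands in for a real LLM response (assumption).
    reply = "1. watercolor painting\n2) night street\n- 3D render\n"
    print(build_domain_query("dog", 3))
    print(build_image_prompts("dog", parse_domains(reply)))
```

The resulting prompts would then be passed to a text-to-image model, and the generated images added to (or, in the data-free setting, used as) the training set.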

Paper Structure (15 sections, 3 theorems, 24 equations, 7 figures, 6 tables)

This paper contains 15 sections, 3 theorems, 24 equations, 7 figures, 6 tables.

Key Result

Theorem

With confidence at least $1 - 2\delta$ and for all $f \in \mathcal{F}$, we have

Figures (7)

  • Figure 1: Overall pipeline of our paradigm: Extrapolation of novel domains via the knowledge of LLMs, a novel learning paradigm where knowledge from LLMs assists the training of generalizable models via text-to-image models in a completely data-free fashion.
  • Figure 2: Knowledge extraction pipeline. We first employ various SOTA prompting methods, e.g. Chain-of-Thought (CoT) prompting and role prompting, to extract domains from the LLM (Step 1), and then automatically generate prompts for a text-to-image model (Step 2).
  • Figure 3: Scaling the training dataset by adding more novel domains. Each novel domain consists of 64 images. To facilitate a fair comparison, we scale the class template method by the same number of images.
  • Figure 4: (a) Effectiveness of CLIP filtering. (b) Comparison between different knowledge extraction methods.
  • Figure 5: Examples of synthetic images conditioned on novel domain knowledge from the LLM. The first two columns (i.e. art painting and cartoon) are selected from the PACS dataset, while the remaining four columns are images generated based on the novel domains (e.g. cityscapes) provided by LLMs.
  • ...and 2 more figures
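
The caption of Figure 4 mentions CLIP filtering but does not describe the rule used. A common way to implement such a filter is to keep only generated images whose embedding is sufficiently similar to the embedding of the class text. The sketch below assumes exactly that: the threshold, the function names and the scoring rule are assumptions, and the embeddings are taken as precomputed vectors from some CLIP-style encoder (no CLIP library is called).

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_by_similarity(image_embs: Sequence[Sequence[float]],
                         text_emb: Sequence[float],
                         threshold: float) -> List[int]:
    """Return indices of images whose similarity to the class text
    embedding is at least `threshold` (a hypothetical filtering rule)."""
    return [i for i, emb in enumerate(image_embs)
            if cosine(emb, text_emb) >= threshold]


if __name__ == "__main__":
    # Toy 2-D vectors stand in for real image and text embeddings (assumption).
    images = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(filter_by_similarity(images, [1.0, 0.0], threshold=0.7))
```

Images that fail the check would be discarded before training, which removes generations that no longer depict the intended class.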

Theorems & Definitions (8)

  • Definition
  • Theorem
  • Remark
  • Theorem
  • Proof
  • Theorem
  • Proof
  • Proof