DomainVerse: A Benchmark Towards Real-World Distribution Shifts For Tuning-Free Adaptive Domain Generalization

Feng Hou, Jin Yuan, Ying Yang, Yang Liu, Yang Zhang, Cheng Zhong, Zhongchao Shi, Jianping Fan, Yong Rui, Zhiqiang He

TL;DR

This work tackles real-world distribution shifts by introducing DomainVerse, a large-scale synthetic benchmark with hierarchical, decomposable domain shifts across 390 fine-grained combinations and 18 coarse domains, generated in a Unity-based environment. It reframes domain generalization as Adaptive Domain Generalization (ADG) for vision-language models and proposes two tuning-free methods, Domain CLIP and Domain++ CLIP, that inject domain priors into prompts, eliminating costly fine-tuning. Across tuning-free evaluation, test-time adaptation, traditional DG benchmarks, and synthetic-to-real transfer to DWild, the proposed methods consistently improve over zero-shot CLIP and post-processing baselines, achieving performance comparable to the state of the art on DomainVerse and competitive gains on PACS and Office-Home. The results demonstrate the practicality of using domain-aware prompts and LLM-generated domain descriptors to bridge real-world distribution gaps in zero-shot and test-time settings, with DomainVerse serving as an evaluation platform for ADG research.

Abstract

Traditional cross-domain tasks, including domain adaptation and domain generalization, rely heavily on training a model on source-domain data. With the recent advances in vision-language models (VLMs), which can be viewed as natural source models, the cross-domain task becomes directly adapting a pre-trained source model to arbitrary target domains given prior domain knowledge; we name this task Adaptive Domain Generalization (ADG). However, current cross-domain datasets have many limitations, such as unrealistic domains, unclear domain definitions, and no support for fine-grained domain decomposition, which motivates us to establish a novel dataset, DomainVerse, for ADG. Benefiting from the introduced hierarchical definition of domain shifts, DomainVerse consists of about 0.5 million images from 390 fine-grained realistic domains. With the help of DomainVerse and VLMs, we propose two methods, Domain CLIP and Domain++ CLIP, for tuning-free adaptive domain generalization. Extensive experiments demonstrate the significance of the dataset and the effectiveness of the proposed methods.
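
To make the idea of prompts "equipped with prior domain knowledge" concrete, the following is a minimal sketch of how fine-grained domains could be enumerated from a hierarchy of shifts and turned into domain-aware text prompts. It is not the authors' implementation: the attribute names in `DOMAIN_SHIFTS`, the helpers `enumerate_domains` and `build_prompt`, and the prompt wording are all hypothetical, chosen only for illustration. The single validity rule mirrors the Figure 2 caption, which notes that snowy exists only in winter.

```python
from itertools import product

# Hypothetical shift hierarchy for illustration only; the actual
# DomainVerse hierarchy is defined in the paper, not here.
DOMAIN_SHIFTS = {
    "season": ["summer", "winter"],
    "weather": ["sunny", "snowy"],
}


def is_valid(domain):
    """Reject combinations that cannot occur (snowy exists only in winter)."""
    return not (domain["weather"] == "snowy" and domain["season"] != "winter")


def enumerate_domains(shifts):
    """Return every valid fine-grained domain as a dict of shift -> value."""
    keys = list(shifts)
    combos = (dict(zip(keys, values))
              for values in product(*(shifts[k] for k in keys)))
    return [d for d in combos if is_valid(d)]


def build_prompt(class_name, domain=None):
    """Build a class prompt, optionally extended with a domain description.

    Without a domain this is a plain CLIP-style zero-shot template
    (assumed wording); with a domain, the prior is appended as text.
    """
    base = f"a photo of a {class_name}"
    if not domain:
        return base
    description = ", ".join(f"{k}: {v}" for k, v in domain.items())
    return f"{base}, {description}"


if __name__ == "__main__":
    for d in enumerate_domains(DOMAIN_SHIFTS):
        print(build_prompt("car", d))
```

In a tuning-free setting, prompts like these would be encoded by the frozen text encoder and compared with the image embedding, so adapting to a new target domain only requires changing the text, not the model weights.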

Paper Structure (17 sections, 4 figures, 5 tables)

Figures (4)

  • Figure 1: In the upper part, we demonstrate the paradigm of the ADG task and traditional UDA and DG. In the middle part, we present the issues of unrealistic domains and unclear domain definitions in the current cross-domain datasets. In the bottom part, we display the common real-world domain shifts to illustrate the difficulty of decoupling these characteristics.
  • Figure 2: DomainVerse statistics for the different shifts and domains. Notably, most domains are evenly distributed, except that snowy exists only in winter, and occlusion is only relatively balanced because object sizes differ. Detailed statistics are given in the Appendix.
  • Figure 3: We showcase the DomainVerse dataset with categories arranged horizontally and domains arranged vertically. More samples can be found in the Appendix.
  • Figure 4: Adaptive Domain Generalization. Left: The generation of domain descriptions. Right: The pipelines of adaptive domain generalization and test-time adaptation.