Domain Generalization: A Survey

Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy

TL;DR

Domain Generalization tackles generalization to unseen domains using only source data, addressing the gap left by domain adaptation and transfer learning. The paper surveys a decade of methods, organizing them into domain alignment, meta-learning, data augmentation, ensembles, self-supervised learning, disentangled representations, regularization, and reinforcement learning, and discusses theory and evaluation. It formalizes problem definitions, datasets, evaluation protocols, and relationships to related topics, providing a unified view of progress and limitations. The work highlights practical implications for CV, speech, NLP, medical imaging, and RL and offers directions for architecture, learning strategies, and benchmarks to advance robust OOD generalization.

Abstract

Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce. This is because most learning algorithms strongly rely on the i.i.d. assumption on source/target data, which is often violated in practice due to domain shift. Domain generalization (DG) aims to achieve OOD generalization by using only source data for model learning. Over the last ten years, research in DG has made great progress, leading to a broad spectrum of methodologies, e.g., those based on domain alignment, meta-learning, data augmentation, or ensemble learning, to name a few; DG has also been studied in various application areas including computer vision, speech recognition, natural language processing, medical imaging, and reinforcement learning. In this paper, for the first time a comprehensive literature review in DG is provided to summarize the developments over the past decade. Specifically, we first cover the background by formally defining DG and relating it to other relevant fields like domain adaptation and transfer learning. Then, we conduct a thorough review into existing methods and theories. Finally, we conclude this survey with insights and discussions on future research directions.

Paper Structure

This paper contains 24 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Example images from three domain generalization benchmarks manifesting different types of domain shift. In (a), the domain shift mainly corresponds to changes in font style, color and background. In (b), dataset-specific biases are clear, which are caused by changes in environment/scene and viewpoint. In (c), image style changes are the main reason for domain shift.
  • Figure 2: Domain alignment is commonly applied to a pair of source domains, either in the feature space (orange arrows) or the classifier's output (green arrows), or both.
  • Figure 3: A commonly used meta-learning paradigm [li2018learning] in domain generalization. The source domains (i.e., art, photo and cartoon from PACS [li2017deeper]) are divided into disjoint meta-source and meta-target domains. The outer learning, which simulates domain shift using the meta-target data, back-propagates the gradients all the way back to the base parameters such that the model learned by the inner algorithm with the meta-source data improves the outer objective. The red arrows in this figure denote the gradient flow through the second-order differentiation.
  • Figure 4: Based on the formulation of the transformation $A(\cdot)$, existing data augmentation methods can be categorized into four groups. a) The first group enhances the generalization of the classifier $f$ by applying hand-engineered image transformations like random crop or color augmentation to simulate domain shift. b) The second group is based on adversarial gradients obtained from either a category classifier ($h = f$) or a domain classifier. c) The third group models $A(\cdot)$ using neural networks, such as random CNNs [xu2021robust], an off-the-shelf style transfer model [zhou2021stylematch], or a learnable image generator [zhou2020learning]. d) The final group injects perturbation into intermediate features in the task model.
  • Figure 5: Common image transformations used as data augmentation in domain generalization [volpi2019addressing, otalora2019staining, chen2020improving, zhang2020generalizing].
  • ...and 1 more figure
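
The Figure 3 caption describes dividing the source domains into disjoint meta-source and meta-target domains. The sketch below illustrates only that splitting step, under stated assumptions: the helper name `meta_splits` and its `n_meta_target` parameter are hypothetical and do not come from the paper, and the inner and outer optimization (including the second-order gradient flow) is not shown.

```python
from itertools import combinations


def meta_splits(source_domains, n_meta_target=1):
    """Yield every (meta_source, meta_target) partition of the source domains.

    Hypothetical helper for illustration: each partition holds out
    `n_meta_target` domains as meta-target and keeps the rest as meta-source,
    so the two groups are always disjoint.
    """
    domains = list(source_domains)
    if not 0 < n_meta_target < len(domains):
        raise ValueError("need at least one meta-source and one meta-target domain")
    for meta_target in combinations(domains, n_meta_target):
        meta_source = tuple(d for d in domains if d not in meta_target)
        yield meta_source, meta_target


# The three source domains named in the Figure 3 caption (from PACS).
splits = list(meta_splits(["art", "photo", "cartoon"]))
for meta_source, meta_target in splits:
    print(meta_source, "->", meta_target)
```

In a training loop, one such partition would typically be drawn per iteration, with the meta-source domains feeding the inner update and the meta-target domains feeding the outer objective; how partitions are sampled is a design choice left open here.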