Domain Generalization through Meta-Learning: A Survey

Arsham Gholamzadeh Khoee, Yinan Yu, Robert Feldt

TL;DR

This survey addresses the challenge of domain generalization (DG) in the presence of distribution shifts by surveying meta-learning approaches that enable fast adaptation and robust generalization to unseen domains without target-domain data. It introduces a two-axis taxonomy that separates methods by how they generalize representations (minimizing inter-domain distances vs maximizing intra-domain diversity) and how they train discriminative classifiers (intra-class compactness vs inter-class separation). The review covers prominent methodologies (MLDG, MetaReg, Feature-Critic Networks, episodic DG, invariant representation learning, semantic feature regularization, and more), datasets, evaluation protocols, and practical applications, and discusses open challenges and promising directions such as causal-informed DG, memory-based strategies, and federated settings. Together, these contributions offer a structured roadmap for researchers and practitioners to design and evaluate meta-learning solutions that generalize across diverse, unseen domains. The work underscores the practical impact of DG via meta-learning in enabling zero-shot transfer, reducing data collection costs, and improving robustness in real-world AI systems.

Abstract

Deep neural networks (DNNs) have revolutionized artificial intelligence but often perform poorly when faced with out-of-distribution (OOD) data, a common scenario due to the inevitable domain shifts in real-world applications. This limitation stems from the common assumption that training and testing data share the same distribution, an assumption frequently violated in practice. Despite their effectiveness with large amounts of data and computational power, DNNs struggle with distributional shifts and limited labeled data, leading to overfitting and poor generalization across various tasks and domains. Meta-learning presents a promising approach by employing algorithms that acquire transferable knowledge across various tasks for fast adaptation, eliminating the need to learn each task from scratch. This survey examines meta-learning with a focus on its contribution to domain generalization. We first clarify the concept of meta-learning for domain generalization and introduce a novel taxonomy based on the feature extraction strategy and the classifier learning methodology, offering a granular view of methodologies. Additionally, we present a decision graph to assist readers in navigating the taxonomy based on data availability and domain shifts, enabling them to select and develop a model tailored to their specific problem requirements. Through an exhaustive review of existing methods and underlying theories, we map out the fundamentals of the field. Our survey provides practical insights and an informed discussion on promising research directions.

Paper Structure (42 sections, 19 equations, 14 figures, 3 tables)

This paper contains 42 sections, 19 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: An illustrative diagram of the meta-learning taxonomy for domain generalization. The quadrant chart highlights two principal axes: the first axis (the generalizability axis) represents the strategy of the feature extractor, contrasting the Minimization of Inter-Domain Distances with the Maximization of Intra-Domain Distances; the second axis (the discriminability axis) depicts the classifier training process, distinguishing between the Minimization of Intra-Class Distances and Maximization of Inter-Class Distances. This diagram visually organizes domain generalization approaches, illustrating their distinct mechanisms for promoting generalization across unseen domains.
  • Figure 2: A decision graph illustrating how to apply the taxonomy of meta-learning approaches for domain generalization. The decision graph categorizes techniques based on two key aspects: the generalizability axis, which focuses on the feature extractor's strategy (minimizing inter-domain distances or maximizing intra-domain distances), and the discriminability axis, which focuses on the classifier's training process (minimizing intra-class distances or maximizing inter-class distances). This decision graph helps the reader navigate the taxonomy, providing use cases, strengths, and weaknesses for each category to make the concepts more tangible, applicable, and actionable.
  • Figure 3: Visual representation of the Feature-Critic Networks for heterogeneous domain generalization, depicting the base network's feature extraction guided by an auxiliary loss from the feature-critic networks to promote domain-invariant feature extraction. The domain-invariant features are generated by minimizing the inter-domain distances.
  • Figure 4: Overview of the Episodic Training for DG framework, illustrating the regularization process where a feature extractor is trained with classifiers from different domains and vice versa, promoting out-of-distribution robustness.
  • Figure 5: Depiction of the meta-learning for invariant representation method. (a) Compared to the ERM baseline, (b) domain invariance learning decreases the discrepancy among source domains and excels in source-domain classification, yet it may still result in significant errors on the target domain. (c) The proposed approach employs bilevel meta-learning to further minimize the discrepancy between the target and source domains, enabling a hypothesis learned from the source domains to generalize effectively to the target domain.
  • ...and 9 more figures
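
The four distances named on the two taxonomy axes (Figures 1 and 2) can be made concrete with a small numerical sketch. The snippet below is only an illustration under stated assumptions, not a definition taken from the survey: it assumes features are plain vectors, uses Euclidean distance between group centroids as the "inter" quantity and mean distance to the own-group centroid as the "intra" quantity, and all function names and toy data are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [fmean(axis) for axis in zip(*points)]


def group_by(features, labels):
    """Collect feature vectors under their label (a domain or a class)."""
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(y, []).append(x)
    return groups


def between_group_distance(features, labels):
    """Mean pairwise Euclidean distance between group centroids.

    Assumed proxy for inter-domain distance (labels = domains) or
    inter-class distance (labels = classes). Needs at least two groups.
    """
    cents = [centroid(p) for p in group_by(features, labels).values()]
    return fmean(dist(a, b) for a, b in combinations(cents, 2))


def within_group_distance(features, labels):
    """Mean Euclidean distance from each sample to its own group centroid.

    Assumed proxy for intra-domain distance (labels = domains) or
    intra-class distance (labels = classes).
    """
    cents = {k: centroid(p) for k, p in group_by(features, labels).items()}
    return fmean(dist(x, cents[y]) for x, y in zip(features, labels))


# Hypothetical toy data: four 2-D features from two domains and two classes.
features = [[0, 0], [0, 2], [4, 0], [4, 2]]
domains = ["photo", "photo", "sketch", "sketch"]
classes = ["dog", "cat", "dog", "cat"]

# Generalizability axis (feature extractor).
inter_domain = between_group_distance(features, domains)
intra_domain = within_group_distance(features, domains)

# Discriminability axis (classifier).
inter_class = between_group_distance(features, classes)
intra_class = within_group_distance(features, classes)

print(inter_domain, intra_domain, inter_class, intra_class)
```

In this toy setup, a method on the "minimize inter-domain distances" side of the generalizability axis would try to shrink `inter_domain`, while a method on the "maximize inter-class distances" side of the discriminability axis would try to grow `inter_class`. Surveyed methods may use other discrepancy measures than centroid distances.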