Table of Contents
Fetching ...

Towards a Problem-Oriented Domain Adaptation Framework for Machine Learning

Philipp Spitzer, Dominik Martin, Laurin Eichberger, Niklas Kühl

TL;DR

The paper tackles the challenge of applying domain adaptation in machine learning by offering a problem-oriented framework that classifies domain shifts into five scenarios based on causality. Grounded in design science research, it provides scenario-specific guidance and a determination procedure, aiming to bridge theory and practice. The framework is validated across artificial and real-world datasets and through a user study with 100 participants, demonstrating its explanatory power and practical utility. Overall, it helps researchers and practitioners diagnose domain shift types, select promising approaches, and interpret failures more productively.

Abstract

Domain adaptation is a sub-field of machine learning that involves transferring knowledge from a source domain to perform the same task in the target domain. It is a typical challenge in machine learning that arises, e.g., when data is obtained from various sources or when using a data basis that changes over time. Recent advances in the field offer promising methods, but it is still challenging for researchers and practitioners to determine if domain adaptation is suitable for a given problem -- and, subsequently, to select the appropriate approach. This article employs design science research to develop a problem-oriented framework for domain adaptation, which is matured in three evaluation episodes. We describe a framework that distinguishes between five domain adaptation scenarios, provides recommendations for addressing each scenario, and offers guidelines for determining if a problem falls into one of these scenarios. During the multiple evaluation episodes, the framework is tested on artificial and real-world datasets and an experimental study involving 100 participants. The evaluation demonstrates that the framework has the explanatory power to capture any domain adaptation problem effectively. In summary, we provide clear guidance for researchers and practitioners who want to employ domain adaptation but lack in-depth knowledge of the possibilities.

Towards a Problem-Oriented Domain Adaptation Framework for Machine Learning

TL;DR

The paper tackles the challenge of applying domain adaptation in machine learning by offering a problem-oriented framework that classifies domain shifts into five scenarios based on causality. Grounded in design science research, it provides scenario-specific guidance and a determination procedure, aiming to bridge theory and practice. The framework is validated across artificial and real-world datasets and through a user study with 100 participants, demonstrating its explanatory power and practical utility. Overall, it helps researchers and practitioners diagnose domain shift types, select promising approaches, and interpret failures more productively.

Abstract

Domain adaptation is a sub-field of machine learning that involves transferring knowledge from a source domain to perform the same task in the target domain. It is a typical challenge in machine learning that arises, e.g., when data is obtained from various sources or when using a data basis that changes over time. Recent advances in the field offer promising methods, but it is still challenging for researchers and practitioners to determine if domain adaptation is suitable for a given problem -- and, subsequently, to select the appropriate approach. This article employs design science research to develop a problem-oriented framework for domain adaptation, which is matured in three evaluation episodes. We describe a framework that distinguishes between five domain adaptation scenarios, provides recommendations for addressing each scenario, and offers guidelines for determining if a problem falls into one of these scenarios. During the multiple evaluation episodes, the framework is tested on artificial and real-world datasets and an experimental study involving 100 participants. The evaluation demonstrates that the framework has the explanatory power to capture any domain adaptation problem effectively. In summary, we provide clear guidance for researchers and practitioners who want to employ domain adaptation but lack in-depth knowledge of the possibilities.
Paper Structure (22 sections, 8 equations, 19 figures, 9 tables)

This paper contains 22 sections, 8 equations, 19 figures, 9 tables.

Figures (19)

  • Figure 1: The number of search results for domain adaptation in Google Scholar by year indicates a growing interest in the topic.
  • Figure 2: Domain adaptation results for CyCada by hoffman2018. The adversarial approach aligns the source domain (GTA5, SVHN) to more closely resemble the target domain (CityScapes, MNIST) and achieves higher accuracy in the respective computer vision tasks compared to only learning from the source domain.
  • Figure 3: Answer rates for questions concerning domain adaptation on the StackExchange network are significantly below the average of 70%. The questions are categorized by the topics: implementation (impl), statistics (stat), artificial intelligence (ai), data science (ds), mathematics (math), and computer science (cs). The data is obtained using the official API as of April 2023.
  • Figure 4: The road map illustrates how the artifact, i.e., the framework, is constructed and improved by the two interconnected steps: theory building and implementing/evaluating. The process can be repeated if finished to incorporate insights from a previous iteration. Alternatively, intermediary results can lead to an immediate revisiting of earlier steps.
  • Figure 5: In $Y \rightarrow X$ systems, a prior shift occurs if the marginal label distributions (left) change, but the class conditionals (right) remain the same.
  • ...and 14 more figures