A Recent Survey of Heterogeneous Transfer Learning
Runxue Bao, Yiming Sun, Yuhe Gao, Jindong Wang, Qiang Yang, Zhi-Hong Mao, Ye Ye
TL;DR
This survey addresses heterogeneous transfer learning (HTL), where knowledge is transferred across domains with differing feature or label spaces. It consolidates over 60 methods into a unified framework, partitioning them into data-based (instance-based and feature representation-based) and model-based (parameter regularization and parameter tuning) categories, and discusses their core assumptions and algorithms. The review covers applications in NLP, computer vision, multimodal learning, and biomedicine, and highlights recent progress such as transformer-based models and multimodal learning techniques. It also identifies limitations of current studies and provides guidance for future research, emphasizing unsupervised and online HTL, large foundation models, knowledge distillation in heterogeneous settings, and the importance of interpretability in HTL systems.
Abstract
Transfer learning, which leverages knowledge from source domains to improve model performance in a target domain, has grown significantly and now supports diverse real-world applications. Its success depends on knowledge shared between the domains. Most methods assume that the source and target domains have identical feature and label spaces, a setting known as homogeneous transfer learning. This assumption is often impractical: source and target domains usually differ in these spaces, and matching data precisely across them is challenging and costly. Consequently, heterogeneous transfer learning (HTL), which addresses these disparities, has become a vital strategy in various tasks. In this paper, we offer an extensive review of over 60 HTL methods, covering both data-based and model-based approaches. We describe the key assumptions and algorithms of these methods and systematically categorize them into instance-based, feature representation-based, parameter regularization, and parameter tuning techniques. Additionally, we explore applications in natural language processing, computer vision, multimodal learning, and biomedicine, aiming to deepen understanding and stimulate further research in these areas. Our paper covers recent advancements in HTL, such as transformer-based models and multimodal learning techniques, so that the review captures the latest developments in the field. We identify key limitations in current HTL studies and offer systematic guidance for future research, highlighting areas that need further exploration and suggesting potential directions for advancing the field.
