DA-Net: A Disentangled and Adaptive Network for Multi-Source Cross-Lingual Transfer Learning

Ling Ge; Chunming Hu; Guanghui Ma; Jihong Liu; Hong Zhang

TL;DR

DA-Net combines a feedback-guided collaborative disentanglement method, which purifies the input representations of the classifiers to mitigate mutual interference among multiple source languages, with a class-aware parallel adaptation method that narrows the language gap of each source-target language pair.

Abstract

Multi-source cross-lingual transfer learning deals with the transfer of task knowledge from multiple labelled source languages to an unlabelled target language under language shift. Existing methods typically focus on weighting the predictions produced by language-specific classifiers of different sources that follow a shared encoder. However, because all source languages share the same encoder, which is updated by all of them, the extracted representations inevitably contain information from different source languages, which may disturb the learning of the language-specific classifiers. Additionally, due to the language gap, language-specific classifiers trained with source labels are unable to make accurate predictions for the target language. Both issues impair the model's performance. To address these challenges, we propose a Disentangled and Adaptive Network (DA-Net). Firstly, we devise a feedback-guided collaborative disentanglement method that seeks to purify the input representations of the classifiers, thereby mitigating mutual interference from multiple sources. Secondly, we propose a class-aware parallel adaptation method that aligns class-level distributions for each source-target language pair, thereby alleviating the language gap within each pair. Experimental results on three different tasks involving 38 languages validate the effectiveness of our approach.
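To make the idea of aligning class-level distributions for each source-target pair more concrete, the following is a minimal sketch and not the paper's actual loss. It assumes, purely for illustration, that alignment is measured as the squared distance between same-class feature centroids, with target classes taken from pseudo labels. All function names and the centroid-distance measure are hypothetical stand-ins.

```python
from collections import defaultdict


def class_centroids(features, labels):
    """Return the mean feature vector for each class label."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def class_alignment_gap(src_feats, src_labels, tgt_feats, tgt_pseudo_labels):
    """Average squared Euclidean distance between same-class centroids.

    Assumption: centroid distance stands in for a class-level
    distribution discrepancy. Classes absent from either side are skipped.
    """
    src = class_centroids(src_feats, src_labels)
    tgt = class_centroids(tgt_feats, tgt_pseudo_labels)
    shared = sorted(set(src) & set(tgt))
    if not shared:
        return 0.0
    total = 0.0
    for c in shared:
        total += sum((a - b) ** 2 for a, b in zip(src[c], tgt[c]))
    return total / len(shared)


def parallel_alignment_gaps(sources, tgt_feats, tgt_pseudo_labels):
    """Compute one gap per source language, each pair handled independently."""
    return {
        lang: class_alignment_gap(feats, labels, tgt_feats, tgt_pseudo_labels)
        for lang, (feats, labels) in sources.items()
    }


# Toy example: two source languages, two classes, 2-d features.
sources = {
    "en": ([[0.0, 0.0], [2.0, 2.0]], [0, 1]),
    "de": ([[1.0, 0.0], [2.0, 3.0]], [0, 1]),
}
target_features = [[0.0, 1.0], [2.0, 2.0]]
target_pseudo_labels = [0, 1]
gaps = parallel_alignment_gaps(sources, target_features, target_pseudo_labels)
```

In a trainable model, a per-pair quantity of this kind would be minimised alongside the classification loss, so that each source-target pair is aligned class by class. A single alignment over all languages at once would not keep the pairs separate in this way.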

Paper Structure (31 sections, 9 equations, 6 figures, 3 tables)

Introduction
Related Works
Methodology
Framework and Basic Pipeline
Model Framework
Shared Multilingual Encoder
Disentangler
Adaptor
Classifier
Feedback-guided Collaborative Disentanglement
Maximise Prediction Accuracy
Minimise Prediction Accuracy
Class-aware Parallel Adaptation
Class Pseudo Labels
Class-aware Distribution Alignment
...and 16 more sections

Figures (6)

Figure 1: Existing works align all languages through an alignment loss $\mathcal{L}_{align}$, causing the model to retain only the information that is invariant across all languages.
Figure 2: An illustration of our proposed model.
Figure 3: Collaborative disentanglement process.
Figure 4: Ablation Study.
Figure 5: Our method achieves better class-level alignment and learns more class-discriminative representations. Plus signs ($+$) and circles ($\bullet$) indicate representations of the source and target languages, respectively. Different colours represent different classes.
...and 1 more figures
