TACIT: A Target-Agnostic Feature Disentanglement Framework for Cross-Domain Text Classification

Rui Song, Fausto Giunchiglia, Yingji Li, Mingjie Tian, Hao Xu

TL;DR

The paper tackles cross-domain text classification when the target domain is agnostic and no unlabeled target data are available, a setting where conventional domain-adaptation methods falter. It introduces TACIT, a target-agnostic feature disentanglement framework in which a VAE-based student separates robust from unrobust features, supported by a teacher trained on easy samples and an unrobust-feature distillation task that reinforces the decoupling. Experiments on Amazon reviews show TACIT performing competitively without any target-domain data, in some cases matching or surpassing state-of-the-art baselines. This lowers the data requirements for cross-domain deployment and supports robust generalization in practical, target-agnostic scenarios.

Abstract

Cross-domain text classification aims to transfer models from label-rich source domains to label-poor target domains, giving it a wide range of practical applications. Many approaches promote cross-domain generalization by capturing domain-invariant features. However, these methods rely on unlabeled samples provided by the target domains, which renders the model ineffective when the target domain is agnostic. Furthermore, the models are easily disturbed by shortcut learning in the source domain, which also hinders improvement of their domain generalization ability. To address these issues, this paper proposes TACIT, a target-domain-agnostic feature disentanglement framework that adaptively decouples robust and unrobust features using Variational Auto-Encoders. Additionally, to encourage the separation of unrobust features from robust features, we design a feature distillation task that compels unrobust features to approximate the output of the teacher. The teacher model is trained on a small number of easy samples, which are likely to carry potential unknown shortcuts. Experimental results verify that our framework achieves results comparable to state-of-the-art baselines while using only source domain data.
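
To make the easy-sample idea concrete, here is a minimal sketch of how a teacher's training subset might be picked from the source domain. It is not the authors' implementation. The function name `select_easy_samples`, the `keep_fraction` parameter, the use of the gold-label probability as the confidence score, and the toy probabilities are all assumptions made for illustration.

```python
def select_easy_samples(probabilities, labels, keep_fraction=0.1):
    """Return indices of the samples an underfit model is most confident on.

    Assumption: "confidence" is the probability the model assigns to the
    gold label, and the easiest samples are the top `keep_fraction` by it.
    """
    confidences = [probs[label] for probs, label in zip(probabilities, labels)]
    # Rank sample indices from most to least confident.
    ranked = sorted(range(len(labels)), key=lambda i: confidences[i], reverse=True)
    keep = max(1, round(keep_fraction * len(labels)))
    return ranked[:keep]


# Toy predictions of a hypothetical underfit binary classifier.
probabilities = [[0.55, 0.45], [0.05, 0.95], [0.90, 0.10], [0.40, 0.60]]
labels = [0, 1, 0, 0]
easy_indices = select_easy_samples(probabilities, labels, keep_fraction=0.5)
```

With these toy values the two samples with gold-label confidence 0.95 and 0.90 are kept. A fixed confidence threshold would be an equally plausible selection rule.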

Paper Structure (22 sections, 1 theorem, 9 equations, 5 figures, 4 tables)

This paper contains 22 sections, 1 theorem, 9 equations, 5 figures, 4 tables.

Key Result

Theorem 1

Suppose a set of independent samples $\{X_1, X_2, \dots, X_n\}$ follows the normal distribution $\mathcal{N}(\mu, \sigma^2)$. Then the sample mean and the sample variance are independent of each other.
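
A quick simulation can make Theorem 1 more tangible. The sketch below is not from the paper: it repeatedly draws small samples, computes each sample's mean and variance, and measures the correlation between the two. The helper names, the sample size `n`, the number of trials, and the choice of an exponential distribution as a contrast case are illustrative assumptions. Near-zero correlation is only a necessary consequence of independence, so this is a sanity check rather than a proof.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mean_variance_correlation(draw, n=10, trials=5000):
    """Correlate the sample mean with the sample variance over many trials."""
    means, variances = [], []
    for _ in range(trials):
        sample = [draw() for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))  # unbiased sample variance
    return pearson(means, variances)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Normal samples: Theorem 1 predicts no dependence between mean and variance.
corr_normal = mean_variance_correlation(lambda: rng.gauss(1.0, 2.0))
# Skewed contrast case: mean and variance move together.
corr_exponential = mean_variance_correlation(lambda: rng.expovariate(1.0))
```

For the normal draws the correlation should come out close to zero, while for the exponential draws it should be clearly positive, which shows that the property is specific to the normal distribution.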

Figures (5)

  • Figure 1: TACIT's overall architecture and processing flow. It consists of two main steps and three tasks. In Step 1, an underfitting model selects a subset of easy samples from the source domain based on confidence. These samples are then used to train a teacher model. In Step 2, the output features of the base model are fed into a VAE for disentanglement. The robust feature $z_\mu$ is used to predict the sample labels, while the unrobust feature $z_\sigma$ is learned from the teacher's output $\hat{z}$ through feature distillation. Finally, the cross-entropy loss, VAE loss, and distillation loss jointly optimize the model. A marker symbol in the figure indicates that model parameters are not updated during training.
  • Figure 2: Comparison of single-source and multi-source experimental results on similar data sets K and E.
  • Figure 3: Changes in loss on fold-1 during model training, with Books and DVDs as source domains. Different line styles represent different datasets and loss values.
  • Figure 4: Comparison of ablation results of different cross-domain generalization tasks, where different colors and styles of bars indicate different TACIT variants.
  • Figure 5: Feature visualisation results of $z_\mu$ and $z_\sigma$ for TACIT and the two corresponding variants TACIT$_{-distill}$ and TACIT$_{-vae}$ on B$\to$D, where the green nodes indicate $z_\sigma$ and the purple nodes indicate $z_\mu$.
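
The Figure 1 caption says that cross-entropy, VAE, and distillation losses are optimized together. The sketch below shows one way such a combined objective could be computed for a single sample. It is a simplified illustration under several assumptions, not the paper's formulation: the VAE term is represented only by a KL divergence to a standard normal prior, $z_\sigma$ is interpreted as a log-variance vector, distillation uses mean squared error against $\hat{z}$, and the weights `beta` and `gamma` are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed in a stable way."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[label]


def kl_to_standard_normal(z_mu, z_log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(z_mu, z_log_var)
    )


def distillation_loss(z_sigma, z_teacher):
    """Mean squared error pulling the unrobust feature toward the teacher output."""
    return sum((s - t) ** 2 for s, t in zip(z_sigma, z_teacher)) / len(z_sigma)


def total_loss(logits, label, z_mu, z_sigma, z_teacher, beta=1.0, gamma=1.0):
    """Weighted sum of the three terms; beta and gamma are assumed weights."""
    return (
        cross_entropy(logits, label)  # logits are predicted from z_mu
        + beta * kl_to_standard_normal(z_mu, z_sigma)  # z_sigma as log-variance
        + gamma * distillation_loss(z_sigma, z_teacher)
    )


# Toy values: a two-class prediction and two-dimensional latent features.
loss = total_loss(
    logits=[2.0, -1.0], label=0,
    z_mu=[0.3, -0.2], z_sigma=[0.1, 0.0], z_teacher=[0.4, -0.1],
)
```

In a real training loop the teacher output would be treated as a constant, since the caption notes that some parameters are not updated during training.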

Theorems & Definitions (2)

  • Theorem 1
  • proof