Unsupervised Robust Domain Adaptation: Paradigm, Theory and Algorithm

Fuxiang Huang, Xiaowei Fu, Shiyu Ye, Lina Ma, Wen Li, Xinbo Gao, David Zhang, Lei Zhang

TL;DR

The paper addresses robustness of unsupervised domain adaptation (UDA) under adversarial perturbations by identifying entanglement between transfer learning and adversarial training in the UDA+VAT setup. It introduces unsupervised robust domain adaptation (URDA) with a formal generalization bound based on an ideal target classifier $h_t^*$ and proposes a practical two-step algorithm, Disentangled Adversarial Robustness Training (DART), to disentangle transfer and robustness training. Empirical results across Office-31, Office-Home, VisDA-2017, DomainNet, and Amazon Reviews show that DART substantially improves adversarial robustness while preserving clean transfer performance, outperforming standard UDA, AT-based defenses, and other robust UDA methods. The work provides a theoretically grounded, easy-to-implement firewall for UDA against attacks, with implications for deploying robust cross-domain models in real-world settings. However, it assumes knowledge of the attack family and leaves open questions on universal robustness to agnostic attacks.

Abstract

Unsupervised domain adaptation (UDA) aims to transfer knowledge from a label-rich source domain to an unlabeled target domain by addressing domain shifts. Most UDA approaches emphasize transfer ability, but often overlook robustness against adversarial attacks. Although vanilla adversarial training (VAT) improves the robustness of deep neural networks, it has little effect on UDA. This paper focuses on answering three key questions: 1) Why does VAT, known for its defensive effectiveness, fail in the UDA paradigm? 2) What is the generalization bound theory under attacks and how does it evolve from classical UDA theory? 3) How can we implement a robustification training procedure without complex modifications? Specifically, we explore and reveal the inherent entanglement challenge in the general UDA+VAT paradigm, and propose an unsupervised robust domain adaptation (URDA) paradigm. We further derive the generalization bound theory of the URDA paradigm so that it can resist adversarial noise and domain shift. To the best of our knowledge, this is the first work to establish the URDA paradigm and theory. We further introduce a simple, novel yet effective URDA algorithm called Disentangled Adversarial Robustness Training (DART), a two-step training procedure that ensures both transferability and robustness. DART first pre-trains an arbitrary UDA model, and then applies an instantaneous robustification post-training step via disentangled distillation. Experiments on four benchmark datasets with/without attacks show that DART effectively enhances robustness while maintaining domain adaptability, and validate the URDA paradigm and theory.

Paper Structure

This paper contains 16 sections, 2 theorems, 16 equations, 9 figures, 9 tables, 1 algorithm.

Key Result

Proposition

According to the triangle inequality for classification error $\epsilon$ [ben2006analysis], for any labeling functions $f_{1}$, $f_{2}$, and $f_{3}$, $\epsilon\left(f_{1}, f_{2}\right) \leq \epsilon\left(f_{1}, f_{3}\right)+ \epsilon\left(f_{2}, f_{3}\right)$. For any hypothesis where $h(\cdot)$ represents the function that classifies the perturbed samples $\tilde{x}$ generate
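The triangle inequality for classification error can be checked numerically on a toy example. The sketch below assumes the empirical form of $\epsilon(f, g)$, namely the fraction of samples on which two labeling functions disagree; the three functions `f1`, `f2`, `f3` and the sample set are hypothetical and chosen only for illustration.

```python
import itertools


def disagreement(f, g, samples):
    """Empirical eps(f, g): fraction of samples on which f and g disagree."""
    return sum(f(x) != g(x) for x in samples) / len(samples)


# Hypothetical labeling functions on a small integer sample set.
samples = list(range(-5, 6))


def f1(x):
    return int(x > 0)


def f2(x):
    return int(x > 2)


def f3(x):
    return int(x % 2 == 0)


# eps(a, b) <= eps(a, c) + eps(b, c) must hold for every ordering of the three.
for a, b, c in itertools.permutations([f1, f2, f3]):
    lhs = disagreement(a, b, samples)
    rhs = disagreement(a, c, samples) + disagreement(b, c, samples)
    assert lhs <= rhs
```

The inequality holds because any sample on which `a` and `b` disagree must also be a sample on which at least one of them disagrees with `c`.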

Figures (9)

  • Figure 1: Schematic result of standard UDA paradigm, UDA+VAT paradigm, and the proposed URDA paradigm. (a) Standard UDA model cannot classify adversarial samples. (b) Robust UDA model with adversarial training can classify adversarial samples better but leads to misclassification of clean samples. (c) The proposed DART can classify both clean and adversarial samples well.
  • Figure 2: Pilot exploratory experiments on the robustness analysis of vanilla UDA models (pseudo-labeling based methods: UPA [chen2024uncertainty] and PL-Mix [kong2024unsupervised]; traditional methods: CDAN [long2018conditional], MCC [DBLP:journals/corr/abs-1912-03699] and MDD [zhang2019bridging]), effectiveness analysis of vanilla adversarial training (AT) for robust UDA, and superiority analysis of the proposed URDA paradigm (i.e., DART). The clean target accuracy and adversarial target accuracy (adv.) are reported on Office-31, Amazon Reviews, Office-Home and VisDA-2017 benchmarks. The vulnerability of UDA models is shown, while the proposed DART substantially improves their robustness.
  • Figure 3: Architecture for standard UDA non-robust paradigm, UDA+VAT robust paradigm and the proposed URDA robust paradigm, resp. (a) Traditional UDA model. (b) Robust UDA+vanilla adversarial training (VAT) by directly imposing consistency constraints between clean and adversarial target samples (entanglement). (c) The proposed URDA paradigm by disentangling transfer training from adversarial training, where Cons. means consistency constraint. In essence, URDA aims to robustify a pretrained UDA model by a robust model supervised with Cons. loss.
  • Figure 4: The pipeline of the proposed DART, derived from the proposed URDA theoretical bound under adversarial noises, is composed of two training steps: pre-training step and post-training step. In Step 1, a non-robust model $M_{uda}=E_{uda}\cup{C_{uda}}$ is pre-trained by an arbitrary UDA method, and frozen. In Step 2, we employ the pre-trained non-robust UDA model $M_{uda}$ to guide the on-the-fly robustification post-training of the robust model $M_{rob}=E_{rob}\cup{C_{rob}}$ via disentangled distillation.
  • Figure 5: The information flow direction varies when imposing constraints between clean and adversarial samples. (a) Direct constraints on $h(\tilde{x})$ and ${h(x)}$, which may damage transfer training and force the model to misclassify clean samples. (b) Indirect constraint on $h(\tilde{x})$ and ${h(x)}$ by introducing $h_t^*(x)$, which ensures independent execution of transfer and adversarial training. This approach enhances robustness through adversarial training while preserving classification performance on clean samples through transfer training.
  • ...and 4 more figures
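
The two-step pipeline described in the Figure 4 caption can be illustrated with a toy sketch. This is not the authors' implementation: the linear logistic model, the one-step FGSM attack, the cross-entropy distillation loss, the random student initialization, and every name (`w_uda`, `w_rob`, `fgsm`, `robustify`) are assumptions made here for illustration. The only property taken from the source is the overall shape: a pre-trained UDA model is frozen, and a separate robust model is trained on adversarial target samples under the guidance of the frozen model's outputs on clean samples.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Probability of class 1 under a linear logistic model (assumed model)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w_rob, x, target, eps):
    """One-step L-infinity attack on the robust model (assumed attack).

    For a logistic model, d(cross-entropy)/dx = (prediction - target) * w.
    """
    g = predict(w_rob, x) - target
    return [xi + eps * sign(g * wi) for xi, wi in zip(x, w_rob)]


def robustify(w_uda, target_data, eps=0.1, lr=0.5, epochs=50, seed=0):
    """Step 2 (sketch): fit w_rob on adversarial inputs to match the frozen
    w_uda outputs on the corresponding clean inputs. w_uda is never updated."""
    rng = random.Random(seed)
    # Random start (an assumption): if w_rob equalled w_uda, the attack
    # gradient at the clean point would be exactly zero.
    w_rob = [rng.uniform(-0.1, 0.1) for _ in w_uda]
    for _ in range(epochs):
        for x in target_data:
            t = predict(w_uda, x)           # frozen model sees the clean sample
            x_adv = fgsm(w_rob, x, t, eps)  # robust model is attacked
            g = predict(w_rob, x_adv) - t   # d(loss)/d(logit)
            w_rob = [wi - lr * g * xi for wi, xi in zip(w_rob, x_adv)]
    return w_rob


# Step 1 stand-in: fixed weights play the role of a pre-trained UDA model.
w_uda = [2.0, -1.0]
data_rng = random.Random(1)
target_data = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(40)]
w_rob = robustify(w_uda, target_data)
```

Because the frozen model only ever receives clean samples and is not updated, the adversarial objective cannot alter what was learned in the first step, which is the separation the caption refers to as disentangled distillation.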

Theorems & Definitions (6)

  • Proposition
  • Definition
  • Theorem 1
  • Proof
  • Remark
  • Remark