Gradually Vanishing Gap in Prototypical Network for Unsupervised Domain Adaptation

Shanshan Wang, Hao Zhou, Xun Yang, Zhenwei He, Mengzhu Wang, Xingyi Zhang, Meng Wang

TL;DR

This work tackles unsupervised domain adaptation under large domain gaps that can cause distribution collapse during alignment. It introduces Gradually Vanishing Gap in Prototypical Network (GVG-PN), which constructs two intermediate, domain-biased domains via a GCN to preserve global distribution structure while maintaining local manifold relations; prototypes are formed by aggregating cross-domain features. A novel ProNCE loss focuses on hard negative prototype pairs to enhance discriminability, complemented by InfoNCE preliminaries and a mutual information term. Theoretical analysis connects target risk to source risk via α-divergence, arguing that progressively aligning intermediate domains tightens bounds on target loss. Empirically, GVG-PN establishes new state-of-the-art results across five benchmarks (Office-31, ImageCLEF-DA, Office-Home, VisDA-2017, DomainNet), validating the effectiveness of domain-biased prototypes and prototype-level contrastive learning for robust, scalable DA in vision tasks.

Abstract

Unsupervised domain adaptation (UDA) is a critical problem in transfer learning that aims to transfer semantic information from a labeled source domain to an unlabeled target domain. Recent UDA models have demonstrated significant generalization capabilities on the target domain. However, the generalization boundary of UDA models remains unclear. When the domain discrepancy is too large, the model cannot preserve the distribution structure, leading to distribution collapse during alignment. To address this challenge, we propose an efficient UDA framework named Gradually Vanishing Gap in Prototypical Network (GVG-PN), which achieves transfer learning from both global and local perspectives. From the global alignment standpoint, our model generates a domain-biased intermediate domain that helps preserve the distribution structures. By entangling cross-domain features, our model progressively reduces the risk of distribution collapse. However, relying only on global alignment is insufficient to preserve the distribution structure. To further enhance the inner relationships of features, we introduce the local perspective. We utilize a graph convolutional network (GCN) as an intuitive method to explore the internal relationships between features, ensuring the preservation of manifold structures and generating domain-biased prototypes. Additionally, we consider the discriminability of the inner relationships between features. We propose a pro-contrastive loss to enhance discriminability at the prototype level by separating hard negative pairs. By incorporating both the GCN and the pro-contrastive loss, our model fully explores fine-grained semantic relationships. Experiments on several UDA benchmarks validate that the proposed GVG-PN clearly outperforms state-of-the-art (SOTA) models.
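
To make the idea of prototype-level contrastive learning with emphasis on hard negatives more concrete, here is a minimal sketch in plain Python. It is not the paper's ProNCE loss, whose exact form is not given in this summary. It assumes prototypes are per-class feature means and uses a hypothetical weighting `exp(w * sim)` on each negative term, so that negative prototype pairs that are more similar (harder) contribute more. The function names, `temperature`, and `w` are all illustrative assumptions.

```python
import math

def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class (a common prototype definition, assumed here)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, y in zip(features, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += feat[d]
    # Classes with no samples get an all-zero prototype.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]

def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def weighted_proto_nce(src_protos, tgt_protos, temperature=0.1, w=2.0):
    """InfoNCE-style loss over prototypes with up-weighted hard negatives.

    Positive pair: same-class prototypes from the two domains.
    Negative pairs: different-class prototypes. Each negative term is
    multiplied by exp(w * sim), a hypothetical hardness weight.
    """
    k = len(src_protos)
    total = 0.0
    for i in range(k):
        pos = math.exp(cosine(src_protos[i], tgt_protos[i]) / temperature)
        neg = 0.0
        for j in range(k):
            if j != i:
                sim = cosine(src_protos[i], tgt_protos[j])
                neg += math.exp(w * sim) * math.exp(sim / temperature)
        total += -math.log(pos / (pos + neg))
    return total / k
```

With `w = 0` the function reduces to a standard InfoNCE over prototypes; increasing `w` raises the penalty only for negative pairs with positive similarity, which is one simple way to express "focus on hard negatives".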

Paper Structure (39 sections, 1 theorem, 23 equations, 8 figures, 8 tables, 1 algorithm)

This paper contains 39 sections, 1 theorem, 23 equations, 8 figures, 8 tables, 1 algorithm.

Key Result

Proposition 1

If $\alpha^{\prime} \in (0,1]$, define $\alpha=1-\alpha^{\prime}$ and assume that the loss $-\log \hat{p}(y \mid z)$ is bounded by $M$ for $y \in \mathcal{Y}$, $z \in \mathcal{Z}$, then the result is: where the loss of the source domain is $l_{\text{source}}=\mathbb{E}_{x, y \sim p(x, y), z \sim p(z \mid x)}[-\log \hat{p}(y \mid z)]$ and the loss of the target domain is $l_{\text{target}}=\mathbb{E}_{x,

Figures (8)

  • Figure 1: Motivation for the proposed approach. Previous DA methods that directly align two domains cannot prevent the misclassification of target samples. In some cases, prototypes of certain categories may stay in incorrect category spaces. To overcome this issue, our proposed approach generates two intermediate domains to achieve progressive alignment. By exploring both global and local distributions, we ensure fine-grained semantic relationships during the generation of intermediate domains. Prototypes are utilized to describe the semantic structure of these intermediate domains. The parameter 'w' represents the extent to which prototypes are pushed apart, thereby enhancing the discriminative ability of hard alignment on categories. Consequently, our model is capable of aligning different distributions while maintaining the integrity of the distribution structure.
  • Figure 2: Motivation of the pro-contrastive learning. Although the transferability in DA could alleviate the domain discrepancy problem, it may not effectively address misclassification issues with hard samples. We aim to assign greater weight to these challenging hard class pairs. By doing so, the discriminability of the hard negative pairs is enhanced, leading to improved separation of features between classes such as 'lion' and 'tiger'.
  • Figure 3: Overview of the GVG-PN method. $\mathcal{F}$ signifies the feature extractor, $\mathcal{C}$ and $\mathcal{G}_C$ represent the classifier components, $\mathcal{G}_A$ denotes the affinity matrix generation layer, $\mathcal{G}_N$ denotes the node update layer, and $T$ corresponds to the ground-truth matrix. (a) In the feature aggregation phase, the ground-truth label guides $\mathcal{G}_{A}$ to generate the affinity matrix $A$. Subsequently, the node features are fed into $\mathcal{G}_{N}$ to obtain the aggregated features ${f}_{gcn}$. To generate domain-biased prototypes, we compute prototypes for each category based on the aggregated features. During the prototype generation process, both intra-class and inter-class relationships are taken into consideration. (b) We utilize the prototypes to explore the discriminability of classes. Our pro-contrastive learning approach aims to bring samples from the same class closer together and push samples from different classes farther apart. Furthermore, we specifically focus on separating harder negative class pairs. As a progressive step, both domains are adapted in this process.
  • Figure 4: Model analyses: (a) Quantitative distribution differences between domains, measured using the $\mathcal{A}$-distance after domain adaptation. (b) Accuracy comparison of GVG-PN and the variant of GVG-PN on three tasks on the Office-31 dataset. (c) Convergence of test errors among different models. (d) Adaptive changes of the dynamic threshold during training on tasks A $\rightarrow$ W and Rw $\rightarrow$ Ar.
  • Figure 5: Visualizing Embedding Features of Task $A \rightarrow W$ on the Office-31 dataset using the t-SNE Algorithm. Top: Visualizing Clustering of Source and Target Domain Features: (a) Non-adaptation, (b) MSTN, (c) GVG-PN (w/o $\mathcal{L}_{\text{ProNCE}}$), and (d) GVG-PN. Bottom: Domain Matching Visualization of (e) Non-adaptation, (f) MSTN, (g) GVG-PN (w/o $\mathcal{L}_{\text{ProNCE}}$), and (h) GVG-PN. Source domain Amazon (blue) and target domain Webcam (red).
  • ...and 3 more figures
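
The feature aggregation phase described in the Figure 3 caption can be illustrated with a small sketch in plain Python. It makes strong simplifying assumptions: in the paper $\mathcal{G}_A$ is a layer that produces the affinity matrix $A$ under the guidance of the ground-truth matrix $T$, whereas here $T$ itself stands in for $A$, and the node update $\mathcal{G}_N$ is replaced by a single row-normalized propagation step with no learned weights. The function names are hypothetical.

```python
def ground_truth_matrix(labels):
    """T[i][j] = 1.0 if samples i and j share a label, else 0.0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def aggregate(features, affinity):
    """One propagation step f' = D^{-1} A f, where D is the row-degree matrix.

    This is a weight-free stand-in for a GCN node update: each node's new
    feature is the affinity-weighted average of all node features.
    """
    dim = len(features[0])
    out = []
    for row in affinity:
        degree = sum(row)
        agg = [0.0] * dim
        for weight, feat in zip(row, features):
            for d in range(dim):
                agg[d] += weight * feat[d]
        out.append([a / degree if degree > 0 else 0.0 for a in agg])
    return out

# Usage: three samples, the first two share class 0.
labels = [0, 0, 1]
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
T = ground_truth_matrix(labels)
f_agg = aggregate(features, T)
```

With a label-derived affinity, every node ends up at its class mean, so the aggregated features already coincide with per-class prototypes. A learned affinity would instead mix in information from similar samples of other classes, which is how inter-class relationships could enter the prototypes.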

Theorems & Definitions (4)

  • Proposition 1
  • Remark 1
  • Remark 2
  • Remark 3