Memory-Assisted Sub-Prototype Mining for Universal Domain Adaptation

Yuxiang Lai, Yi Zhou, Xinghong Liu, Tao Zhou

TL;DR

The paper tackles universal domain adaptation with target-private classes and notable intra-class variation. It introduces Memory-Assisted Sub-Prototype Mining (MemSPM), a memory-based framework that learns sub-prototypes within each category to enable finer-grained alignment across domains. By combining a fixed CLIP-based encoder (producing $Z$) with a memory-driven retrieval to form a task-oriented embedding $\hat{Z}$, along with cycle-consistent alignment and a reconstruction decoder for interpretability, MemSPM achieves strong performance on UniDA, OSDA, and PDA benchmarks. The work demonstrates that modeling intra-class structure and providing interpretable memory visualizations can substantially reduce negative transfer and improve transferability in heterogeneous domain settings.

Abstract

Universal domain adaptation aims to align the classes and reduce the feature gap between the same category of the source and target domains. The target private category is set as the unknown class during the adaptation process, as it is not included in the source domain. However, most existing methods overlook the intra-class structure within a category, especially in cases where there exists significant concept shift between the samples belonging to the same category. When samples with large concept shift are forced to be pushed together, it may negatively affect the adaptation performance. Moreover, from the interpretability aspect, it is unreasonable to align visual features with significant differences, such as fighter jets and civil aircraft, into the same category. Unfortunately, due to such semantic ambiguity and annotation cost, categories are not always classified in detail, making it difficult for the model to perform precise adaptation. To address these issues, we propose a novel Memory-Assisted Sub-Prototype Mining (MemSPM) method that can learn the differences between samples belonging to the same category and mine sub-classes when there exists significant concept shift between them. By doing so, our model learns a more reasonable feature space that enhances the transferability and reflects the inherent differences among samples annotated as the same category. We evaluate the effectiveness of our MemSPM method over multiple scenarios, including UniDA, OSDA, and PDA. Our method achieves state-of-the-art performance on four benchmarks in most cases.

Paper Structure

This paper contains 24 sections, 9 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Illustration of our motivation. (a) Examples of concept shift and intra-class diversity in DA benchmarks. For the alarm clock class, we find that digital clocks, pointer clocks, and alarm bells should be placed in different sub-classes. For the airplane class, we find that images containing more than one plane, a single jetliner, or a turboprop aircraft should be treated differently for adaptation. (b) Previous methods use one-hot labels to guide classification without considering intra-class distinctions. Consequently, the model forces all samples from the same class to converge towards a single center, disregarding the diversity within the class. Our method clusters samples with large intra-class differences into separate sub-classes, providing a more accurate representation. (c) During domain adaptation with our design, samples in the target domain can also be aligned near the sub-class centers with similar features, rather than only near the class centers determined by labels.
  • Figure 2: Our model first uses a fixed pre-trained model as the encoder to extract an input-oriented embedding $Z$ from an input sample. This embedding is then compared with the sub-prototypes learned in memory to find the $K$ closest ones. These $K$ sub-prototypes are combined by weighted averaging into a task-oriented embedding $\hat{Z}$, which represents the input and is used for learning downstream tasks. During the UniDA process, we adopt the cycle-consistent matching method on the task-oriented embedding $\hat{Z}$ generated from the memory. Moreover, a decoder is designed to reconstruct the image, allowing the sub-prototypes in memory to be visualized and the effectiveness of sub-class learning to be verified.
  • Figure 3: (a) The tSNE visualization shows the feature space of the sub-classes belonging to each category, which demonstrates that MemSPM mines the sub-prototypes successfully. (b) The results for different values of $S$ and $N$. (c) The reconstruction visualization shows what has been learned in the memory, which demonstrates that MemSPM has learned the intra-class diversity. (d) The visualization for varying $K$ shows that values of $K$ that are too small hinder the learning of appearance features.
  • Figure 4: The reconstruction visualization shows what has been learned in the memory, which demonstrates the intra-class diversity has been learned by MemSPM.
  • Figure 5: The tSNE visualization shows the distribution of the retrieved sub-prototypes and demonstrates that the sub-classes have been learned by MemSPM.
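
The retrieval step described in the Figure 2 caption (compare the input-oriented embedding with the sub-prototypes in memory, keep the $K$ closest, and average them with weights into $\hat{Z}$) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `retrieve_task_embedding`, the flat memory layout of shape `(M, D)`, the cosine similarity measure, the softmax weighting, and the `temperature` parameter are all assumptions introduced here, since the summary above does not specify them.

```python
import numpy as np

def retrieve_task_embedding(z, memory, k=5, temperature=0.1):
    """Hypothetical sketch: turn an input-oriented embedding z of shape (D,)
    into a task-oriented embedding z_hat using a memory of shape (M, D)."""
    # Cosine similarity between z and every sub-prototype (assumed metric).
    z_unit = z / np.linalg.norm(z)
    mem_unit = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    sims = mem_unit @ z_unit                      # shape (M,)

    # Indices of the K most similar sub-prototypes, most similar first.
    top_idx = np.argsort(sims)[-k:][::-1]

    # Softmax over the K similarities gives the averaging weights (assumed).
    logits = sims[top_idx] / temperature
    weights = np.exp(logits - logits.max())       # subtract max for stability
    weights /= weights.sum()

    # Weighted average of the retrieved sub-prototypes.
    z_hat = weights @ memory[top_idx]             # shape (D,)
    return z_hat, top_idx, weights

# Toy usage: random memory and a query that lies close to sub-prototype 3.
rng = np.random.default_rng(0)
memory = rng.normal(size=(32, 8))
z = memory[3] + 0.01 * rng.normal(size=8)
z_hat, top_idx, weights = retrieve_task_embedding(z, memory, k=4)
```

In this toy setup the query is a slightly perturbed copy of one memory row, so that row is retrieved first and receives the largest weight. A very small `k` restricts $\hat{Z}$ to a few sub-prototypes, which is one way to read the Figure 3(d) observation about the choice of $K$.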