Improving Domain Adaptation Through Class Aware Frequency Transformation

Vikash Kumar, Himanshu Patil, Rohit Lal, Anirban Chakraborty

TL;DR

This work tackles unsupervised domain adaptation under large domain gaps by introducing CAFT++, a class-aware frequency transformation framework. It combines a lightweight, non-parametric frequency-domain style transfer (class-conditioned low-frequency swapping) with a pseudo-label filtering mechanism based on the absolute difference of the top-2 prediction probabilities and a two-component Gaussian mixture model. By training with both original and frequency-transformed source images and using only high-confidence pseudo labels for fine-tuning, CAFT++ consistently improves performance on the Office-31, Office-Home, and VisDA benchmarks when plugged into diverse UDA methods. The approach is computationally efficient, does not rely on generative adversarial networks, and is effective in both standard closed-set and label-shift scenarios.

Abstract

In this work, we explore the use of frequency transformation for reducing the domain shift between the source and target domains (e.g., synthetic and real images, respectively) towards solving the domain adaptation task. Most Unsupervised Domain Adaptation (UDA) algorithms focus on reducing the global domain shift between labelled source and unlabelled target domains by matching the marginal distributions under a small domain gap assumption. UDA performance degrades when the domain gap between the source and target distributions is large. To bring the source and target domains closer, we propose a novel approach based on a traditional image processing technique, Class Aware Frequency Transformation (CAFT), that uses pseudo-label-based, class-consistent low-frequency swapping to improve the overall performance of existing UDA algorithms. Compared with state-of-the-art deep-learning-based methods, the proposed approach is computationally more efficient and can easily be plugged into any existing UDA algorithm to improve its performance. Additionally, we introduce a novel approach based on the absolute difference of top-2 class prediction probabilities (ADT2P) for filtering target pseudo labels into clean and noisy sets. Samples with clean pseudo labels can be used to improve the performance of unsupervised learning algorithms. We name the overall framework CAFT++. We evaluate it on top of different UDA algorithms across many public domain adaptation datasets. Our extensive experiments indicate that CAFT++ achieves significant performance gains across all the popular benchmarks.
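
The class-consistent low-frequency swapping described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact procedure, so several details here are assumptions made for illustration: only the Fourier amplitude is swapped while the source phase is kept, the low-frequency region is a centred square whose half-width is a fraction `beta` of the shorter image side, a same-class target image is picked at random, and a source image is left unchanged when no target shares its class. The names `low_freq_swap`, `class_aware_swap`, and `beta` are hypothetical.

```python
import numpy as np


def low_freq_swap(source, target, beta=0.1):
    """Replace the low-frequency amplitude of `source` with that of `target`.

    source, target: float arrays of identical shape (H, W, C).
    beta: hypothetical fraction of the shorter side used as the half-width
          of the centred low-frequency window (an assumption of this sketch).
    """
    fft_src = np.fft.fft2(source, axes=(0, 1))
    fft_tgt = np.fft.fft2(target, axes=(0, 1))

    # Split into amplitude and phase; the source phase is kept untouched.
    amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
    amp_tgt = np.abs(fft_tgt)

    # Move the zero frequency to the centre so the low band is one window.
    amp_src = np.fft.fftshift(amp_src, axes=(0, 1))
    amp_tgt = np.fft.fftshift(amp_tgt, axes=(0, 1))

    h, w = source.shape[:2]
    b = int(np.floor(min(h, w) * beta))
    ch, cw = h // 2, w // 2
    rows = slice(ch - b, ch + b + 1)
    cols = slice(cw - b, cw + b + 1)
    amp_src[rows, cols] = amp_tgt[rows, cols]

    # Undo the shift, recombine with the source phase, and invert.
    amp_src = np.fft.ifftshift(amp_src, axes=(0, 1))
    mixed = amp_src * np.exp(1j * pha_src)
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))


def class_aware_swap(src_img, src_label, tgt_imgs, tgt_pseudo, rng, beta=0.1):
    """Swap low frequencies with a random target whose pseudo label matches."""
    candidates = np.flatnonzero(tgt_pseudo == src_label)
    if candidates.size == 0:
        # Assumption of this sketch: no same-class target, keep the source.
        return src_img
    j = rng.choice(candidates)
    return low_freq_swap(src_img, tgt_imgs[j], beta)
```

Because the window is symmetric around the zero frequency and the source phase is preserved, the inverse transform stays real up to floating-point error, which is why taking the real part is safe in this sketch.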

Paper Structure

This paper contains 24 sections, 8 equations, 20 figures, 9 tables.

Figures (20)

  • Figure 1: Method Overview. (Top Row) represents general Unsupervised Domain Adaptation (UDA) setting with labelled source images and unlabelled target images. UDA algorithms try to reduce the domain gap. (Bottom Row) Our proposed method explicitly tries to swap the source image style with that of the target image using class aware frequency transform towards improved transferability.
  • Figure 2: Architecture of CAFT++. The architecture is divided into four stages. In Stage 1, we train a domain adaptation network using the labelled source domain dataset and unlabelled target domain dataset. This trained model is then used to generate pseudo labels in Stage 2. We pass the image through the network to get the difference in top-2 prediction probabilities. We further process this difference in probabilities through a GMM to decide whether it is clean or noisy. In Stage 3, we transform the source image using frequency domain manipulation with the help of the generated target pseudo labels. The transformed source is closer to the target. The source $\mathcal{S}$, transformed source $\hat{\mathcal{S}}$ (with labels), and target $\mathcal{T}$ images (unlabelled) are then passed through the Domain Adaptation network in Stage 4.
  • Figure 3: Method of separating clean and noisy pseudo labels. To reduce the noise content, pseudo labels are filtered using the absolute difference of top-2 probabilities (ADT2P). In this method, we obtain the predicted probability distribution of an image by passing it through the classifier backbone. We then take the absolute difference of the top-2 probabilities, $\tilde{x}=\lvert p_1^m - p_2^m \rvert$, and process it through the Gaussian Mixture Model. If the model predicts that the pseudo label is clean, the sample is used for further training; otherwise, it is discarded.
  • Figure 4: Illustration of the absolute difference of top-2 prediction probabilities. Samples close to the decision boundary have a lower top-1 prediction probability than samples far from it, while the probability of the nearest competing class remains significant. Hence, the absolute difference between the top-2 prediction probabilities is smaller for samples near the decision boundary and larger for samples far from it.
  • Figure 5: Histogram of the absolute difference of top-2 prediction probabilities (ADT2P) for various datasets and splits. We fit a two-component Gaussian Mixture Model to separate correct and incorrect pseudo labels. The ground truth is used here for visualization only. Most correct pseudo labels have a higher ADT2P value and lie towards the right of the histogram, whereas incorrect samples have a smaller ADT2P and lie towards the left. So, if a sample's ADT2P belongs to the Gaussian with the higher mean, it is treated as a high-confidence sample, and vice versa.
  • ...and 15 more figures
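
The ADT2P filtering described in the captions of Figures 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a small hand-rolled EM routine stands in for a library Gaussian mixture model, a sample is called clean when its posterior for the higher-mean component exceeds a hypothetical `threshold` of 0.5, and the names `adt2p`, `fit_gmm_1d`, `responsibilities`, and `clean_mask` are invented for this example.

```python
import numpy as np


def adt2p(probs):
    """Absolute difference of the top-2 class probabilities, one per row."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return np.abs(top2[:, 1] - top2[:, 0])


def responsibilities(x, weights, means, variances):
    """Posterior probability of each of the two components for each sample."""
    log_p = (np.log(weights)
             - 0.5 * np.log(2.0 * np.pi * variances)
             - 0.5 * (x[:, None] - means) ** 2 / variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def fit_gmm_1d(x, n_iter=100, eps=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    x = np.asarray(x, dtype=float)
    # Initialise one component at each end of the observed range.
    means = np.array([x.min(), x.max()])
    variances = np.full(2, x.var() + eps)
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        resp = responsibilities(x, weights, means, variances)  # E-step
        nk = resp.sum(axis=0) + eps                            # M-step
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + eps
    return weights, means, variances


def clean_mask(probs, threshold=0.5):
    """True where the pseudo label is assigned to the higher-mean Gaussian."""
    x = adt2p(probs)
    weights, means, variances = fit_gmm_1d(x)
    resp = responsibilities(x, weights, means, variances)
    return resp[:, np.argmax(means)] > threshold
```

Given an `(N, K)` array of softmax outputs for the target set, `clean_mask(probs)` returns a boolean array; only the samples marked `True` would be kept, with `probs.argmax(axis=1)` as their pseudo labels, and the rest discarded.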