Reinforced Domain Selection for Continuous Domain Adaptation

Hanbing Liu, Huaze Tang, Yanru Wu, Yang Li, Xiao-Ping Zhang

TL;DR

Continuous Domain Adaptation (CDA) depends on choosing effective intermediate domains, which is difficult when no explicit domain metadata is available. This work introduces a reinforcement learning framework combined with feature disentanglement that discovers transfer paths in an unsupervised manner, guided by a reward based on distances between latent domain embeddings. A dual-network architecture isolates domain-invariant and domain-specific features, trained with mutual information-based losses and a supervised objective on the invariant features, while a policy generator learns which intermediate domains to traverse. Empirical results on Rotated MNIST and ADNI show improvements in target accuracy and path efficiency over traditional CDA methods, supporting the practicality of learning domain paths dynamically for cross-domain adaptation.

Abstract

Continuous Domain Adaptation (CDA) bridges significant domain shifts by progressively adapting from the source domain through intermediate domains to the target domain. However, selecting intermediate domains without explicit metadata remains a substantial challenge that has not been extensively explored in existing studies. To tackle this issue, we propose a novel framework that combines reinforcement learning with feature disentanglement to conduct domain path selection in an unsupervised CDA setting. Our approach introduces an unsupervised reward mechanism that uses the distances between latent domain embeddings to identify optimal transfer paths. Feature disentanglement serves two purposes: the domain-specific features are used to compute the unsupervised rewards, and the domain-invariant features are aligned to promote domain adaptation. This integrated strategy is designed to optimize transfer paths and target task performance simultaneously. Extensive empirical evaluations on datasets such as Rotated MNIST and ADNI demonstrate substantial improvements in prediction accuracy and domain selection efficiency over traditional CDA approaches.
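To make the idea of a distance-based unsupervised reward more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each domain is summarized by the mean of its domain-specific features, that the reward for moving to a candidate domain is the reduction in Euclidean distance to the target embedding, and that a simple greedy rule stands in for the learned policy generator. The names `centroid`, `step_reward`, and `select_path` are hypothetical and do not come from the paper.

```python
import math


def centroid(features):
    """Summarize a domain by the mean of its feature vectors (an assumption)."""
    return [sum(col) / len(features) for col in zip(*features)]


def step_reward(current, candidate, target):
    """Hypothetical reward: how much closer to the target the candidate is
    than the current domain, measured by Euclidean distance."""
    return math.dist(current, target) - math.dist(candidate, target)


def select_path(source, candidates, target):
    """Greedy stand-in for a learned policy: among candidates with positive
    reward, repeatedly move to the one nearest the current domain."""
    path, current = [], source
    remaining = dict(candidates)
    while remaining:
        useful = {
            name: emb
            for name, emb in remaining.items()
            if step_reward(current, emb, target) > 0
        }
        if not useful:
            break  # no remaining domain moves closer to the target
        name = min(useful, key=lambda n: math.dist(current, useful[n]))
        path.append(name)
        current = remaining.pop(name)
    return path


def unit(deg):
    """Toy 2-D embedding of a rotation angle, used only for illustration."""
    rad = math.radians(deg)
    return [math.cos(rad), math.sin(rad)]


if __name__ == "__main__":
    # Toy domains loosely inspired by rotation angles; the values are made up.
    candidates = {f"rot{a}": unit(a) for a in (54, 18, 72, 36)}
    candidates["distractor"] = unit(-30)
    print(select_path(unit(0), candidates, unit(90)))
```

In this toy setup the greedy rule visits the rotation domains in order of increasing angle and skips the distractor, because moving to it would increase the distance to the target. A learned policy, as described in the abstract, would replace the greedy rule and could trade off path length against target performance.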

Paper Structure

This paper contains 11 sections, 8 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Overview and framework of our method. (a) Overview of Continuous Domain Adaptation using reinforcement learning. Our approach employs a policy generator to devise strategies for intermediate domains, thus establishing an optimal transfer path. (b) Framework of the proposed method. Input from the source, target, and intermediate domains is processed by a feature extractor to derive common features. A dual-network system then separates these into domain-invariant and domain-specific features. Distances between the domain-specific features of different domains are used to calculate rewards, which guide the policy generator in formulating a policy for each intermediate domain.
  • Figure 2: Reinforced domain selection results. The y-axis of the left figure lists the intermediate domains, each identified by its rotation angle, ranging from the source domain at 0 degrees to the target domain, which lies 18 degrees beyond the last intermediate domain. The x-axis is sampled every five epochs. A color gradient from light to dark shows the order of domain selection throughout the experiment, with white denoting domains that were not selected. The blue and yellow curves in the right figure represent setups with four and five intermediate domains, respectively, consistent with the configurations shown in the left figure.
  • Figure 3: Visualizations of the domain-specific and domain-invariant features. The plots were generated with the t-SNE method (sejourne2021large), reducing the features to three dimensions. Different colors represent different domains.