Gradual Domain Adaptation via Normalizing Flows
Shogo Sagawa, Hideitsu Hino
TL;DR
The paper tackles unsupervised domain adaptation under large source–target gaps where gradual self-training is ineffective due to few intermediate domains. It introduces a continuous normalizing flow framework that learns a continuous transformation from the target distribution toward a Gaussian mixture via the source domain, enabling gradual shifts without self-training. The approach combines CNFs with non-parametric log-likelihood estimation, a Gaussian mixture base, and a theoretical bound linking domain transport to target risk; experiments show improved performance and the ability to generate synthetic intermediate domains. This method offers a scalable, theoretically grounded alternative to self-training in challenging domain-shift settings, with practical implications for real-world tasks exhibiting substantial domain gaps.
Abstract
Standard domain adaptation methods do not work well when a large gap exists between the source and target domains. Gradual domain adaptation is one of the approaches used to address the problem. It involves leveraging the intermediate domain, which gradually shifts from the source domain to the target domain. In previous work, it is assumed that the number of intermediate domains is large and the distance between adjacent domains is small; hence, the gradual domain adaptation algorithm, involving self-training with unlabeled datasets, is applicable. In practice, however, gradual self-training will fail because the number of intermediate domains is limited and the distance between adjacent domains is large. We propose the use of normalizing flows to deal with this problem while maintaining the framework of unsupervised domain adaptation. The proposed method learns a transformation from the distribution of the target domain to the Gaussian mixture distribution via the source domain. We evaluate our proposed method by experiments using real-world datasets and confirm that it mitigates the above-explained problem and improves the classification performance.
