CFL: On the Use of Characteristic Function Loss for Domain Alignment in Machine Learning

Abdullah Almansour, Ozan Tonguz

TL;DR

Distribution shift poses a major risk for ML deployment, especially in high-stakes settings. The paper introduces Characteristic Function Loss (CFL), a frequency-domain method for domain alignment that uses the characteristic function $\phi_X(w)=\mathbb{E}[e^{j w^\top X}]$ (and its empirical form) to measure and minimize cross-domain differences, and combines it with standard ERM as $\ell_{total}=\ell_{ERM}+\lambda \ell_{CFL}$. CFL aligns domain embeddings by matching their CFs via $\ell_{CFL}=\frac{1}{N}\sum_{n=1}^{N} (\phi_{S,n}(w)-\phi_{T,n}(w))^2$, and experiments on the PACS dataset show reduced inter-domain gaps and improved generalization to unseen domains. Overall, the approach avoids high-dimensional PDF estimation and provides a principled, practical tool for domain adaptation and uncertainty assessment in deployment contexts.
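The quantities named in the TL;DR can be illustrated with a small, dependency-free sketch. It is one possible reading of the loss, not the authors' implementation. The empirical CF is taken to be the standard sample average $\hat{\phi}_X(w)=\frac{1}{M}\sum_{m=1}^{M} e^{j w^\top x_m}$. Because the CF is complex-valued, the sketch assumes the squared modulus of the gap and averages it over a hand-picked set of frequency vectors. The names `empirical_cf`, `cf_loss`, `total_loss`, `draw`, and `freqs`, and the default `lam=1.0`, are hypothetical and chosen only for illustration.

```python
import cmath
import random


def empirical_cf(samples, w):
    """Empirical CF at frequency vector w: the mean of exp(j * w^T x) over samples."""
    total = 0j
    for x in samples:
        phase = sum(wi * xi for wi, xi in zip(w, x))  # inner product w^T x
        total += cmath.exp(1j * phase)
    return total / len(samples)


def cf_loss(source, target, freqs):
    """Average squared modulus of the CF gap over the frequency vectors (assumed form)."""
    gaps = [abs(empirical_cf(source, w) - empirical_cf(target, w)) ** 2 for w in freqs]
    return sum(gaps) / len(gaps)


def total_loss(erm_loss, source, target, freqs, lam=1.0):
    """Combined objective: ERM loss plus lam times the CF alignment term."""
    return erm_loss + lam * cf_loss(source, target, freqs)


rng = random.Random(0)


def draw(mean, count=500, dim=2):
    """Toy 'embeddings': Gaussian points around `mean` (illustrative data only)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(count)]


freqs = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]]  # assumed frequency vectors
source = draw(0.0)
same_dist = draw(0.0)   # fresh sample from the same distribution as source
shifted = draw(2.0)     # sample from a mean-shifted distribution

print("same distribution:", cf_loss(source, same_dist, freqs))
print("shifted target:   ", cf_loss(source, shifted, freqs))
```

On this toy data the loss for the mean-shifted target is larger than for a fresh sample from the source distribution, which is the behaviour an alignment term needs. In a training loop the same computation would be written in an autodiff framework so that the embedding network receives gradients from the alignment term.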

Abstract

Machine Learning (ML) models are used extensively in a wide range of applications because of their significant advantages over traditional learning methods. However, these models often underperform when deployed in the real world because of the well-known distribution shift problem. This problem can lead to catastrophic outcomes when such decision-making systems operate in high-risk applications. Many researchers have studied distribution shift in ML using statistical techniques (such as Kullback-Leibler divergence, the Kolmogorov-Smirnov test, and Wasserstein distance) to quantify it. In this letter, we show that using the Characteristic Function (CF) as a frequency-domain approach is a powerful alternative for measuring distribution shift in high-dimensional space and for domain adaptation.

Paper Structure

This paper contains 7 sections, 6 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: (a) Samples from the Person class. (b) The complex plane, showing domain gaps in the frequency domain. (c) The first two principal components of the embeddings in the spatial domain. The gap (domain shift) between domains is clearly visible in (b), whereas it is difficult to see in (c).
  • Figure 2: (a) Samples from the Elephant class. (b) and (c) The complex plane of the domain distributions for the pretrained backbone model and after training with our approach, respectively. The CF approach can minimize this divergence between domains, yielding a model that performs well in deployment environments with unforeseen domains.
  • Figure 3: Example demonstrating different manifestations of distribution shift in the PACS dataset for the object class Dog.