Phase Matching for Out-of-Distribution Generalization

Chengming Hu, Yeqian Du, Rui Wang, Hao Chen, Congcong Zhu

TL;DR

Phase Matching (PhaMa) addresses domain generalization by leveraging a frequency-domain decomposition of images, treating the amplitude spectrum as the domain-sensitive component and the phase spectrum as the structure-preserving, generalizable component. The method perturbs amplitudes during training and enforces phase-consistent representations via cross-patch contrast, using a momentum encoder and a PatchNCE loss; the overall objective combines a classification term and a contrast term. Empirical results on DG benchmarks (PACS, Digits-DG, Office-Home, VLCS) and robustness tests (CIFAR-10/100-C, GTA5→Cityscapes) show substantial improvements over baselines and competitive methods, supported by t-SNE and Grad-CAM analyses. A Fourier-based Structural Causal Model provides a conceptual framework linking Fourier components to DG, highlighting a semi-causal phase and non-causal amplitude in cross-domain generalization and suggesting directions for future work in latent-space frequency representations.
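
The amplitude perturbation mentioned above can be illustrated with a minimal NumPy sketch. The summary does not specify the exact perturbation rule, so the linear mixing with a reference image, the function name `perturb_amplitude`, and the parameter `lam` are assumptions made for illustration only, not the authors' implementation. The point is only that the amplitude changes while the phase of the original image is kept.

```python
import numpy as np

def perturb_amplitude(x_o, x_ref, lam=0.5):
    """Return an augmented image x_a that keeps the phase of x_o.

    Assumption (not from the paper): the amplitude is perturbed by
    linearly mixing it with the amplitude of a reference image x_ref.
    x_o, x_ref: real arrays of shape (H, W) or (H, W, C).
    """
    # 2-D FFT over the spatial axes only.
    f_o = np.fft.fft2(x_o, axes=(0, 1))
    f_ref = np.fft.fft2(x_ref, axes=(0, 1))

    # Split the original spectrum into amplitude and phase.
    amp_o, pha_o = np.abs(f_o), np.angle(f_o)
    amp_ref = np.abs(f_ref)

    # Perturb only the amplitude; lam = 0 leaves the image unchanged.
    amp_mix = (1.0 - lam) * amp_o + lam * amp_ref

    # Recombine with the ORIGINAL phase and go back to pixel space.
    f_a = amp_mix * np.exp(1j * pha_o)
    return np.real(np.fft.ifft2(f_a, axes=(0, 1)))
```

With `lam=0` the function returns the input image up to floating-point error; for any `lam` in (0, 1] the output shares its phase spectrum with `x_o`, which is the property the phase-matching objective relies on.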

Abstract

The Fourier transform, an explicit decomposition method for visual signals, has been employed to explain the out-of-distribution generalization behaviors of Deep Neural Networks (DNNs). Previous studies indicate that the amplitude spectrum is susceptible to the disturbance caused by distribution shifts, whereas the phase spectrum preserves highly-structured spatial information that is crucial for robust visual representation learning. Inspired by this insight, this paper is dedicated to clarifying the relationships between Domain Generalization (DG) and the frequency components. Specifically, we provide distribution analysis and empirical experiments for the frequency components. Based on these observations, we propose a Phase Matching approach, termed PhaMa, to address DG problems. To this end, PhaMa introduces perturbations on the amplitude spectrum and establishes spatial relationships to match the phase components with patch contrastive learning. Experiments on multiple benchmarks demonstrate that our proposed method achieves state-of-the-art performance in domain generalization and out-of-distribution robustness tasks. Beyond vanilla analysis and experiments, we further clarify the relationships between the Fourier components and DG problems by introducing a Fourier-based Structural Causal Model (SCM).
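
To make the patch contrastive step more concrete, the following is a rough NumPy sketch of an InfoNCE-style patch loss applied in both directions between the original and augmented views, plus an exponential moving average (EMA) update for the momentum branch. The function names, the temperature `tau`, the momentum `m`, and the symmetric sum of the two directions are assumptions for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def patch_nce(q, k, tau=0.07):
    """InfoNCE over patches: row i of q is positive with row i of k,
    and every other row of k serves as a negative.
    q, k: arrays of shape (num_patches, dim)."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)
    logits = q @ k.T / tau                                # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))             # cross-entropy on diagonal

def cross_patch_nce(p_o, p_a, k_o, k_a, tau=0.07):
    """Hypothetical cross-contrast: patches of the original view (p_o) are
    matched to momentum features of the augmented view (k_a), and vice versa."""
    return patch_nce(p_o, k_a, tau) + patch_nce(p_a, k_o, tau)

def ema_update(momentum_params, online_params, m=0.999):
    """EMA update of the momentum branch (receives no gradients)."""
    return [m * pm + (1.0 - m) * po
            for pm, po in zip(momentum_params, online_params)]
```

Because the two views differ only in amplitude, a patch and its counterpart at the same spatial location share phase information, so pulling them together while pushing other locations apart is one way to encourage phase-consistent features.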

Paper Structure (34 sections, 16 equations, 13 figures, 10 tables, 1 algorithm)

This paper contains 34 sections, 16 equations, 13 figures, 10 tables, 1 algorithm.

Figures (13)

  • Figure 1: (a) and (b) are the boxplots of the centroid frequency $F_{c}$ and frequency standard deviation $F_{std}$ of the amplitude spectra from PACS. (c) The PACS dataset [li2017deeper] is a commonly used benchmark for domain generalization, comprising four distinct domains: Art Painting, Cartoon, Photo, and Sketch.
  • Figure 2: Reconstructions for the Fourier components. We calculate the phase and amplitude spectrum for each image in Fig. 1, and randomly initialize the phase (a) or amplitude (b) spectrum to a constant. (a) and (b) are the reconstructed images with only amplitude information and only phase information, respectively.
  • Figure 3: Framework of PhaMa. The Fourier-augmented image pairs ($x_{o}$ and $x_{a}$) are both fed to the encoder and the momentum-updated encoder. Then, the features from the last two layers are sent to a 2-layer nonlinear projection head for extracting patch representations $p_{o(a)}^{i}$. The cross-contrastive gradients are backpropagated to update the encoder and projection head (in green); the parameters of the momentum branch (in blue) are updated using an exponential moving average (EMA).
  • Figure 4: Evaluation of different trade-off parameters.
  • Figure 5: t-SNE visualization on Baseline and PhaMa. We visualize the flattened last-layer embeddings of ResNet-18 on PACS with Art, Photo and Sketch as the source domains, and Cartoon as the target domain.
  • ...and 8 more figures
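
The reconstruction experiment described in the Figure 2 caption can be approximated with a short NumPy sketch: one Fourier component is replaced by a constant and the image is rebuilt from the other. The function names and the constant values (`const_phase=0.0`, `const_amp=1.0`) are assumptions chosen for illustration, not values taken from the paper.

```python
import numpy as np

def amplitude_only(x, const_phase=0.0):
    """Rebuild an image from its amplitude spectrum, with the phase
    replaced by a constant (spatial structure is discarded)."""
    amp = np.abs(np.fft.fft2(x, axes=(0, 1)))
    spectrum = amp * np.exp(1j * const_phase)
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))

def phase_only(x, const_amp=1.0):
    """Rebuild an image from its phase spectrum, with the amplitude
    replaced by a constant (spatial structure is retained)."""
    pha = np.angle(np.fft.fft2(x, axes=(0, 1)))
    spectrum = const_amp * np.exp(1j * pha)
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))

# Toy example: a single bright pixel at row 2, column 3.
x = np.zeros((8, 8))
x[2, 3] = 1.0
rec_phase = phase_only(x)      # the pixel stays at (2, 3)
rec_amp = amplitude_only(x)    # the pixel moves to (0, 0): position is lost
```

In this toy case the amplitude spectrum is flat, so the phase-only reconstruction recovers the image exactly, while the amplitude-only reconstruction loses the pixel's location. This mirrors the caption's observation that spatial structure is carried by the phase.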