Non-stationary Domain Generalization: Theory and Algorithm

Thai-Hoang Pham, Xueru Zhang, Ping Zhang

TL;DR

The paper tackles domain generalization (DG) in non-stationary environments, where domains evolve sequentially. It introduces Adaptive Invariant Representation Learning (AIRL), which learns a sequence of representations that are invariant between consecutive domains while adapting along the domain timeline, using encoder (Enc), Transformer, and LSTM components. The theoretical results give upper bounds on the target-domain error and justify learning the ground-truth non-stationary mechanism M together with a sequence of models; AIRL puts this into practice with two losses, one for invariant representation and one for predictive accuracy. Empirically, AIRL outperforms a broad set of baselines on synthetic and real datasets, and ablations demonstrate the necessity of each component. Overall, AIRL offers a principled, scalable approach to robust DG in non-stationary settings, with practical implications for applications whose data evolve over time.
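The two-loss structure described above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the weight `alpha`, and the use of a squared distance between mean representations as the invariance term are all assumptions chosen for illustration, and the paper's actual invariant-representation loss may differ. The sketch only shows the general shape of an objective that sums a prediction loss over domains and an invariance penalty over consecutive domain pairs.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def invariance_penalty(z_t, z_next):
    """Squared distance between the mean representations of two domains.

    Assumption: a simple stand-in for an invariant-representation loss,
    not the loss used in the paper.
    """
    m_t, m_next = mean_vector(z_t), mean_vector(z_next)
    return sum((a - b) ** 2 for a, b in zip(m_t, m_next))


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    eps = 1e-12  # avoids log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def two_term_objective(reps, probs, labels, alpha=1.0):
    """Prediction loss over all domains plus alpha times the invariance
    penalty over consecutive domain pairs (t, t+1).

    reps[t]   : list of representation vectors for domain t
    probs[t]  : list of predicted class-probability vectors for domain t
    labels[t] : list of integer class labels for domain t
    alpha     : hypothetical trade-off weight between the two terms
    """
    pred = sum(cross_entropy(p, y) for p, y in zip(probs, labels))
    inv = sum(
        invariance_penalty(reps[t], reps[t + 1]) for t in range(len(reps) - 1)
    )
    return pred + alpha * inv


if __name__ == "__main__":
    # Two toy domains whose mean representations differ by (1, 1).
    reps = [[[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [3.0, 3.0]]]
    probs = [[[0.9, 0.1], [0.2, 0.8]], [[0.7, 0.3], [0.4, 0.6]]]
    labels = [[0, 1], [0, 1]]
    print(two_term_objective(reps, probs, labels, alpha=0.5))
```

Applying the penalty only to neighbouring domains, rather than to all pairs, mirrors the TL;DR's description of representations that are invariant between consecutive domains yet free to drift along the timeline.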

Abstract

Although recent advances in machine learning have proven successful at learning from independent and identically distributed (IID) data, the resulting models remain vulnerable to out-of-distribution (OOD) data in an open world. Domain generalization (DG) addresses this issue: it aims to learn, from multiple source domains, a model that generalizes to unseen target domains. Existing studies on DG have largely focused on stationary settings with homogeneous source domains. However, in many applications, domains may evolve along a specific direction (e.g., time, space). Without accounting for such non-stationary patterns, models trained with existing methods may fail to generalize to OOD data. In this paper, we study domain generalization in non-stationary environments. We first examine the impact of environmental non-stationarity on model performance and establish theoretical upper bounds for the model error at target domains. Then, we propose a novel algorithm based on adaptive invariant representation learning, which leverages the non-stationary pattern to train a model that attains good performance on target domains. Experiments on both synthetic and real data validate the proposed algorithm.

Paper Structure (34 sections, 8 theorems, 36 equations, 8 figures, 8 tables)

This paper contains 34 sections, 8 theorems, 36 equations, 8 figures, 8 tables.

Key Result

Theorem 1

Given domain sequence $\{D_t\}_{t=1}^{T+K}$, dataset sequence $\{S_t\}_{t=1}^{T+K}$ sampled from $\{D_t\}_{t=1}^{T+K}$, for any $M \in \mathcal{M}$ ($M$ can depend on $\{S_t\}_{t=1}^{T+K}$) and any $0 < \delta < 1$, with probability at least $1 - \delta$ over the choice of dataset sequence $\{S_t\}_

Figures (8)

  • Figure 1: An illustrative comparison between conventional DG and DG in a non-stationary environment: domains in conventional DG are independently sampled from a stationary environment, whereas DG in a non-stationary environment considers domains that evolve along a specific direction. As shown in the right plot, the data (here, images) change over time, and a model trained on past data may not perform well on future data due to non-stationarity (i.e., temporal shift).
  • Figure 2: Visualization of learning non-stationary mapping between two domains $D^W_t$ (i.e., generated from $D_{t+1}$ by importance weighting) and $D_{t+1}$. (a) Learning in input space $\mathcal{X}$. (b) Learning in representation space $\mathcal{Z}$.
  • Figure 3: Overall architecture of AIRL and the visualization of its learning process.
  • Figure 4: Learning process for AIRL
  • Figure 5: Inference process for AIRL
  • ...and 3 more figures

Theorems & Definitions (12)

  • Definition 1: Non-stationary complexity
  • Theorem 1
  • Proposition 1
  • Remark 1: Representation learning
  • Remark 2: Comparison with conventional DG
  • Definition 2
  • Proposition 2
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • ...and 2 more