Non-stationary Domain Generalization: Theory and Algorithm
Thai-Hoang Pham, Xueru Zhang, Ping Zhang
TL;DR
The paper tackles domain generalization (DG) in non-stationary environments, where domains evolve sequentially. It introduces Adaptive Invariant Representation Learning (AIRL), which learns a sequence of representations that are invariant between consecutive domains while adapting along the domain timeline, using encoder (Enc), Transformer, and LSTM components. The theoretical results give upper bounds on the target-domain error and justify learning the underlying non-stationary mechanism M together with a sequence of models. AIRL puts this into practice with two losses, one for invariant representation and one for predictive accuracy. Empirically, AIRL outperforms a broad set of baselines on synthetic and real datasets, and ablations indicate that each component is needed. Overall, AIRL offers a principled approach to robust DG in non-stationary settings, with practical implications for applications whose data evolve over time.
Abstract
Although recent advances in machine learning have proven successful at learning from independent and identically distributed (IID) data, the resulting models remain vulnerable to out-of-distribution (OOD) data in an open world. Domain generalization (DG) addresses this issue: it aims to learn a model from multiple source domains that generalizes to unseen target domains. Existing studies on DG have largely focused on stationary settings with homogeneous source domains. However, in many applications, domains may evolve along a specific direction (e.g., time, space). Without accounting for such non-stationary patterns, models trained with existing methods may fail to generalize to OOD data. In this paper, we study domain generalization in non-stationary environments. We first examine the impact of environmental non-stationarity on model performance and establish theoretical upper bounds on the model error at target domains. Then, we propose a novel algorithm based on adaptive invariant representation learning, which leverages the non-stationary pattern to train a model that attains good performance on target domains. Experiments on both synthetic and real data validate the proposed algorithm.
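As a rough illustration of the kind of objective summarized above (a predictive term plus an invariance term between consecutive domains), the following minimal Python sketch may help. It is not the authors' implementation. The function names, the `alpha` weight, the dictionary layout, and the mean-matching penalty are all assumptions made for illustration; the paper's actual invariance loss and its Enc, Transformer, and LSTM modules are not reproduced here.

```python
from math import log

def mean_vector(reps):
    """Average a list of equal-length representation vectors."""
    n = len(reps)
    return [sum(col) / n for col in zip(*reps)]

def invariance_penalty(reps_a, reps_b):
    """Squared Euclidean distance between the mean representations of two domains.

    Assumption: a simple mean-matching surrogate, not the paper's loss.
    """
    mu_a, mu_b = mean_vector(reps_a), mean_vector(reps_b)
    return sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    return -sum(log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def sequential_objective(domains, alpha=1.0):
    """Combine predictive and invariance terms over time-ordered domains.

    domains: list ordered along the domain timeline; each item is a dict
    (hypothetical layout) with keys 'reps' (list of vectors),
    'probs' (list of class-probability lists), and 'labels' (list of ints).
    alpha: hypothetical trade-off weight between the two terms.
    """
    if len(domains) < 2:
        raise ValueError("need at least two consecutive domains")
    # Predictive term: average cross-entropy over all source domains.
    pred = sum(cross_entropy(d["probs"], d["labels"]) for d in domains) / len(domains)
    # Invariance term: only consecutive pairs (t, t+1) are compared.
    pairs = list(zip(domains, domains[1:]))
    inv = sum(invariance_penalty(a["reps"], b["reps"]) for a, b in pairs) / len(pairs)
    return pred + alpha * inv
```

Comparing only consecutive pairs, rather than all pairs of domains, reflects the idea that representations need to be invariant locally along the timeline while still being free to drift over longer horizons.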
