Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization
Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang
TL;DR
The paper tackles Out-Of-Distribution generalization under context imbalance by arguing that context is also invariant to class, which removes the need for context annotations. It introduces IRMCon, which learns a context representation by minimizing an intra-class contrastive loss $L_{ct}$ within an IRM framework that treats the classes as environments. The result is a context extractor $\phi_t$ whose output aligns with $\mathbf{x}_t$; this estimated context is then used in an IPW reweighting scheme to train a context-balanced classifier (IRMCon-IPW). Experiments on context-biased datasets (Colored MNIST, Corrupted CIFAR-10, BAR) and a domain-gap dataset (PACS) show state-of-the-art OOD performance under context bias and competitive results on domain generalization, using a protocol without pretraining to avoid leakage. Theoretical justification is provided in the appendix, and the method can be applied without context labels.
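To make the training objective concrete, here is a minimal NumPy sketch of an intra-class contrastive loss with an IRMv1-style penalty, using each class as one environment. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature `tau`, the weight `lam`, and the use of two augmented views per sample as positive pairs are all hypothetical choices made for this example.

```python
import numpy as np

def class_contrastive_terms(z1, z2, tau=0.5, scale=1.0):
    """InfoNCE loss inside ONE class, plus its IRMv1-style penalty.

    z1, z2: (n, d) embeddings of two views of the same n samples, all from
    a single class. Row i of z1 and row i of z2 form the positive pair;
    the other rows of z2 are negatives from the same class.
    scale: dummy multiplier on the logits (IRMv1 uses scale = 1).
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    raw = z1 @ z2.T / tau                       # (n, n) scaled cosine similarities
    logits = scale * raw
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -np.mean(np.diag(log_prob))
    # Analytic derivative of the loss with respect to the dummy scale.
    prob = np.exp(log_prob)
    grad_scale = np.mean(np.sum(prob * raw, axis=1) - np.diag(raw))
    return loss, grad_scale ** 2

def irmcon_style_objective(z1, z2, labels, lam=1.0, tau=0.5):
    """Sum per-class losses and penalties; classes play the role of environments."""
    total_loss, total_penalty = 0.0, 0.0
    for c in np.unique(labels):
        idx = labels == c
        loss, penalty = class_contrastive_terms(z1[idx], z2[idx], tau)
        total_loss += loss
        total_penalty += penalty
    return total_loss + lam * total_penalty, total_loss, total_penalty
```

Because all negatives share the anchor's class, the class itself cannot help tell samples apart, so the embedding is pushed toward context information. The penalty term is small when the same similarity scale is near-optimal in every class, which is one way to express "invariant across classes".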
Abstract
Out-Of-Distribution generalization (OOD) is all about learning invariance against environmental changes. If the context in every class were evenly distributed, OOD would be trivial because the context could be easily removed thanks to an underlying principle: class is invariant to context. However, collecting such a balanced dataset is impractical. Learning on imbalanced data makes the model biased toward context and thus hurts OOD. Therefore, the key to OOD is context balance. We argue that the assumption widely adopted in prior work, that the context bias can be directly annotated or estimated from biased class prediction, renders the context incomplete or even incorrect. In contrast, we point out the ever-overlooked other side of the above principle: context is also invariant to class, which motivates us to consider the classes (which are already labeled) as the varying environments to resolve context bias (without context labels). We implement this idea by minimizing the contrastive loss of intra-class sample similarity while ensuring that this similarity is invariant across all classes. On benchmarks with various context biases and domain gaps, we show that a simple re-weighting based classifier equipped with our context estimation achieves state-of-the-art performance. We provide the theoretical justifications in the Appendix and code at https://github.com/simpleshinobu/IRMCon.
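The re-weighting step can be illustrated with a generic inverse probability weighting (IPW) sketch. It assumes that each sample already has a discrete context id, for example obtained by clustering the estimated context features; that discretization, the function names, and the count-based estimate of $P(\text{context} \mid \text{class})$ are assumptions of this example, and the paper's actual estimator may differ.

```python
import numpy as np

def ipw_weights(labels, contexts):
    """Weight each sample by 1 / P(context | class), estimated from counts."""
    labels = np.asarray(labels)
    contexts = np.asarray(contexts)
    weights = np.empty(len(labels), dtype=float)
    for y in np.unique(labels):
        in_class = labels == y
        n_y = in_class.sum()
        for c in np.unique(contexts[in_class]):
            cell = in_class & (contexts == c)
            # P(c | y) = cell.sum() / n_y, so its inverse is n_y / cell.sum().
            weights[cell] = n_y / cell.sum()
    return weights

def weighted_cross_entropy(log_probs, labels, weights):
    """Weighted mean negative log-likelihood.

    log_probs: (n, num_classes) log-probabilities from any classifier.
    """
    labels = np.asarray(labels)
    nll = -log_probs[np.arange(len(labels)), labels]
    return np.sum(weights * nll) / np.sum(weights)

# Toy class with 9 samples in context 0 and 1 sample in context 1.
labels = np.zeros(10, dtype=int)
contexts = np.array([0] * 9 + [1])
w = ipw_weights(labels, contexts)
```

In this toy case the single minority-context sample receives weight 10 and each majority-context sample receives 10/9, so both contexts contribute the same total weight to the loss for that class.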
