Invariant Representation via Decoupling Style and Spurious Features from Images
Ruimeng Li, Yuanhao Pu, Zhaoyi Li, Hong Xie, Defu Lian
TL;DR
This work tackles OOD generalization when domain labels are missing, identifying style distribution shift and spurious features as two distinct sources of distribution change. It proposes a structural causal model (SCM) and a framework called IRSS that decouples these two factors without extra supervision by combining an adversarial style alignment stage with IRM-style penalties across discovered environments and an entropy term. IRSS improves invariant feature extraction and outperforms strong baselines on PACS, OfficeHome, and NICO. The approach offers a practical path toward robust OOD generalization without domain annotations, and the paper includes ablations and interpretability analyses that validate the contribution of each component.
Abstract
This paper considers the out-of-distribution (OOD) generalization problem in the setting where both style distribution shift and spurious features are present and domain labels are missing. This setting arises frequently in real-world applications but has been overlooked, because previous approaches mainly handle only one of these two factors. The critical challenge is decoupling style and spurious features in the absence of domain labels. To address this challenge, we first propose a structural causal model (SCM) of the image generation process that captures both style distribution shift and spurious features. The proposed SCM enables us to design a new framework called IRSS, which gradually separates style distribution and spurious features from images by introducing adversarial neural networks and multi-environment optimization, thus achieving OOD generalization. Moreover, it requires no supervision (e.g., domain labels) beyond the images and their corresponding labels. Experiments on benchmark datasets demonstrate that IRSS outperforms traditional OOD methods and resolves the degradation of invariant risk minimization (IRM), enabling the extraction of invariant features under distribution shift.
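For readers unfamiliar with the IRM-style penalty mentioned above, the following is a minimal sketch of the standard IRMv1 objective for binary classification, written in plain Python. It is not the IRSS objective: the adversarial style alignment, the environment discovery step, and the entropy term are not reproduced, and the function names (`env_risk_and_penalty`, `irm_objective`), the weight `lam`, and the choice to average rather than sum over environments are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def env_risk_and_penalty(logits, labels):
    """Return (risk, penalty) for one environment.

    risk    : mean binary cross-entropy of the logits against 0/1 labels.
    penalty : squared derivative of that risk with respect to a dummy
              scalar w multiplying the logits, evaluated at w = 1
              (the IRMv1 penalty).
    """
    n = len(logits)
    risk = 0.0
    grad = 0.0
    for f, y in zip(logits, labels):
        # BCE(w * f, y) = softplus(w * f) - y * w * f
        risk += softplus(f) - y * f
        # d/dw BCE(w * f, y) at w = 1 is (sigmoid(f) - y) * f
        grad += (sigmoid(f) - y) * f
    return risk / n, (grad / n) ** 2


def irm_objective(envs, lam=1.0):
    """Mean risk plus lam times mean penalty over environments.

    envs is a list of (logits, labels) pairs, one per environment.
    Averaging over environments is an assumption; summing only
    rescales the objective.
    """
    results = [env_risk_and_penalty(f, y) for f, y in envs]
    mean_risk = sum(r for r, _ in results) / len(results)
    mean_penalty = sum(p for _, p in results) / len(results)
    return mean_risk + lam * mean_penalty
```

In this sketch each environment is a list of classifier logits with matching 0/1 labels. The penalty is zero when the fixed classifier scale `w = 1` is already a stationary point of the risk in that environment, which is the invariance condition that IRMv1 relaxes into a differentiable term.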
