Invariant Representation via Decoupling Style and Spurious Features from Images

Ruimeng Li, Yuanhao Pu, Zhaoyi Li, Hong Xie, Defu Lian

TL;DR

This work tackles OOD generalization when domain labels are missing, identifying style distribution shift and spurious features as two distinct sources of distribution change. It proposes a Structural Causal Model and a framework called IRSS that decouples these factors using adversarial style alignment and multi-environment optimization, without extra supervision. By combining an adversarial style alignment stage with IRM-style penalties across discovered environments and an entropy term, IRSS improves invariant feature extraction and outperforms strong baselines on PACS, OfficeHome, and NICO. The approach provides a practical path toward robust OOD generalization without domain annotations and includes comprehensive ablations and interpretability analyses to validate component contributions.

Abstract

This paper considers the out-of-distribution (OOD) generalization problem in the setting where style distribution shift and spurious features are both present and domain labels are missing. This setting arises frequently in real-world applications but has been overlooked, because previous approaches mainly handle only one of these two factors. The critical challenge is decoupling style and spurious features in the absence of domain labels. To address this challenge, we first propose a structural causal model (SCM) for the image generation process, which captures both style distribution shift and spurious features. The proposed SCM enables us to design a new framework called IRSS, which gradually separates style distribution and spurious features from images by introducing adversarial neural networks and multi-environment optimization, thus achieving OOD generalization. Moreover, it requires no supervision (e.g., domain labels) beyond the images and their corresponding labels. Experiments on benchmark datasets demonstrate that IRSS outperforms traditional OOD methods and resolves the degradation of invariant risk minimization (IRM), enabling the extraction of invariant features under distribution shift.
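To make the multi-environment optimization mentioned above more concrete, here is a minimal sketch of an IRM-style penalty. It follows the commonly used IRMv1 form (the squared gradient of each environment's risk with respect to a dummy classifier scale fixed at 1) for binary logistic loss. It is not the IRSS objective: the function names, the binary setting, and the hand-specified environments are assumptions made for illustration, whereas IRSS is described as working without given domain labels.

```python
import math

def _sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def logistic_risk(logits, labels, scale=1.0):
    """Mean binary cross-entropy of (scale * logit) against 0/1 labels."""
    total = 0.0
    for z, y in zip(logits, labels):
        s = scale * z
        # log(1 + exp(s)) computed stably, minus y * s
        total += max(s, 0.0) + math.log1p(math.exp(-abs(s))) - y * s
    return total / len(logits)

def irmv1_penalty(logits, labels):
    """Squared derivative of logistic_risk w.r.t. scale, evaluated at scale = 1."""
    grad = sum((_sigmoid(z) - y) * z for z, y in zip(logits, labels)) / len(logits)
    return grad ** 2

def irm_objective(env_batches, penalty_weight):
    """Sum over environments of risk + penalty_weight * penalty.

    env_batches is a list of (logits, labels) pairs, one per environment.
    Here the environments are simply given (an assumption of this sketch).
    """
    total = 0.0
    for logits, labels in env_batches:
        total += logistic_risk(logits, labels)
        total += penalty_weight * irmv1_penalty(logits, labels)
    return total

# Hypothetical toy data: two environments with a few logits each.
env_a = ([2.0, -1.0, 0.5], [1, 0, 1])
env_b = ([0.3, -2.0, 1.5], [0, 0, 1])
objective = irm_objective([env_a, env_b], penalty_weight=10.0)
```

The penalty is zero when rescaling the classifier output cannot lower the risk in any environment, which is the sense in which a representation is treated as invariant across environments in this formulation.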

Paper Structure (25 sections, 11 equations, 6 figures, 5 tables, 1 algorithm)


Figures (6)

  • Figure 1: Illustration of the OOD problems caused by style and spurious features, using elephant-labeled images from the PACS dataset (Li et al., 2017). There are two distinct OOD problems: (1) an inherent distribution shift across domains, caused by styles that vary from one domain to another, and (2) a distribution shift in spurious features among images with the same style, where the non-target objects may differ from image to image.
  • Figure 2: The proposed SCM for image generation, which serves as the causal-inspired assumption of IRSS. The causal features of the target object in the image are represented by $C_{cau}$. In contrast, the unrelated features are represented by $C_{spu}$. The combination of $C_{cau}$ and $C_{spu}$ forms the image's overall feature, denoted as content $Con$. $Con$ is then combined with the style feature $Sty$ to generate the final image $X$. Additionally, $C_{cau}$ determines the classification label $Y$.
  • Figure 3: IRSS degenerates into DANN and IRM
  • Figure 4: The proposed method framework IRSS consists of three main parts: (A) Adversarial Net Part, (B) Main Network, and (C) IRM Calculate Part.
  • Figure 5: Sensitivity analysis of the number of environments and the number of styles on the PACS and OfficeHome datasets
  • ...and 1 more figures
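
The SCM described in the Figure 2 caption can be written as a set of structural assignments. This is a sketch under stated assumptions: the function symbols $f_{Con}$, $f_X$, and $f_Y$ are placeholders introduced here, and any noise terms are omitted because the caption does not specify them.

$$
\begin{aligned}
Con &:= f_{Con}(C_{cau},\, C_{spu}), \\
X &:= f_X(Con,\, Sty), \\
Y &:= f_Y(C_{cau}).
\end{aligned}
$$

Under this reading, $Y$ depends on the image only through $C_{cau}$, so a predictor that relies on $Sty$ or $C_{spu}$ can fail when either of those distributions shifts.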