Casual Inference via Style Bias Deconfounding for Domain Generalization

Jiaxi Li, Di Lin, Hao Chen, Hongying Liu, Liang Wan, Wei Feng

TL;DR

This work tackles domain generalization by treating style as a confounder that biases predictions under distribution shift. It introduces Style Deconfounding Causal Learning (SDCL), which combines a structural causal model with a style-guided expert module (SGEM) and a back-door causal learning module (BDCL) to perform causal interventions using AdaIN-inspired style transfers. The framework is designed to be compatible with existing data augmentation techniques and is validated across natural and medical image recognition tasks, achieving superior generalization in both single-domain and multi-domain settings. The results demonstrate that explicitly modeling and intervening on style confounding yields more robust, causal-feature-driven representations with practical benefits for real-world deployment. The approach also provides a concrete pathway for integrating causal inference into domain-generalization pipelines.

Abstract

Deep neural networks (DNNs) often struggle with out-of-distribution data, limiting their reliability in diverse real-world applications. To address this issue, domain generalization methods have been developed to learn domain-invariant features from single or multiple training domains, enabling generalization to unseen testing domains. However, existing approaches usually overlook the impact of style frequency within the training set. This oversight predisposes models to capture spurious visual correlations caused by style confounding factors, rather than learning truly causal representations, thereby undermining inference reliability. In this work, we introduce Style Deconfounding Causal Learning (SDCL), a novel causal inference-based framework designed to explicitly address style as a confounding factor. Our approach begins by constructing a structural causal model (SCM) tailored to the domain generalization problem and applies a back-door adjustment strategy to account for style influence. Building on this foundation, we design a style-guided expert module (SGEM) to adaptively cluster style distributions during training, capturing the global confounding style. Additionally, a back-door causal learning module (BDCL) performs causal interventions during feature extraction, ensuring fair integration of global confounding styles into sample predictions and thereby reducing style bias. The SDCL framework is highly versatile and can be seamlessly integrated with state-of-the-art data augmentation techniques. Extensive experiments across diverse natural and medical image recognition tasks validate its efficacy, demonstrating superior performance in both multi-domain and the more challenging single-domain generalization scenarios.
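
For readers unfamiliar with the back-door adjustment mentioned in the abstract, its textbook form is sketched below. The symbols are assumptions made here for illustration and may not match the paper's own notation: $X$ is the input image (or its features), $Y$ the label, and $S$ the style confounder, assumed to take finitely many values and to satisfy the back-door criterion relative to $(X, Y)$.

```latex
% Standard back-door adjustment (notation assumed, not taken from the paper):
% intervening on X removes the dependence of S on X, so each style s is
% weighted by its prior P(S = s) rather than by P(S = s | X).
P\bigl(Y \mid \mathrm{do}(X)\bigr)
  = \sum_{s} P\bigl(Y \mid X,\, S = s\bigr)\, P(S = s)

% Contrast with the observational conditional, which weights styles by how
% often they co-occur with X in the training set:
P(Y \mid X)
  = \sum_{s} P\bigl(Y \mid X,\, S = s\bigr)\, P(S = s \mid X)
```

The only difference between the two expressions is the weight on each style: $P(S = s)$ under intervention versus $P(S = s \mid X)$ under observation. This is one way to read the abstract's concern about style frequency in the training set.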

Paper Structure

This paper contains 23 sections, 12 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: A schematic before and after causal intervention. Before intervention: The model relies on frequently occurring style types (orange oval) to make predictions. After intervention: Different style features from the source domain (green, blue) are fairly incorporated into the prediction of the current sample (orange), enabling the model to consider global styles comprehensively, thus eliminating style bias.
  • Figure 2: The proposed SCM for DG and the causal inference process. Please refer to the text for detailed explanation.
  • Figure 3: Overview of SDCL. The original and augmented samples are input into SDCL, which first extracts style embeddings from shallow features and inputs them into the SGEM for adaptive expert allocation, constructing a confounded style set. Then, BDCL uses this confounded set to perform style transfer to facilitate sample interventions, obtaining causal features that are input into the subsequent network for classification or segmentation.
  • Figure 4: Visualization. (a) Grad-CAM on PACS dataset ("Photo" $\rightarrow$ "Art/Sketch"). (b) Semantic segmentation example on the unseen Cityscapes domain, using a model trained on the GTAV dataset. (c) Semantic segmentation examples on the unseen abdominal MR domain, using a model trained on abdominal CT images, and on the unseen cardiac domain B, using a model trained on cardiac domain A.
  • Figure 5: Hyper-parameter sensitivity analysis on PACS. $N$ denotes the number of experts, and $k$ is the number of experts selected by TopK.
  • ...and 3 more figures
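
The captions for Figures 1 and 3 describe an intervention in which styles from a confounder set are transferred onto the current sample's features. The snippet below is a minimal NumPy sketch of that idea, not the authors' implementation. It uses the standard AdaIN statistic swap (re-normalizing per-channel mean and standard deviation) and then averages the restyled features with prior weights. The function names `adain` and `intervene`, the representation of each style as a per-channel mean and standard deviation, and the prior-weighted averaging are all assumptions made for illustration.

```python
import numpy as np


def adain(content, style_mean, style_std, eps=1e-5):
    """Standard AdaIN: give a (C, H, W) feature map the target per-channel statistics."""
    mu = content.mean(axis=(1, 2), keepdims=True)
    sigma = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mu) / (sigma + eps)  # zero mean, roughly unit std per channel
    return style_std[:, None, None] * normalized + style_mean[:, None, None]


def intervene(content, style_means, style_stds, priors):
    """Hypothetical intervention: prior-weighted average over K restyled copies.

    style_means, style_stds: arrays of shape (K, C), one row per style in the
    (assumed) confounder set. priors: length-K weights that sum to 1.
    """
    out = np.zeros(content.shape, dtype=float)
    for mean_k, std_k, p_k in zip(style_means, style_stds, priors):
        out += p_k * adain(content, mean_k, std_k)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(3, 8, 8))           # toy shallow feature map, C=3
    means = rng.normal(size=(4, 3))                 # K=4 assumed style prototypes
    stds = rng.uniform(0.5, 1.5, size=(4, 3))
    uniform_priors = np.full(4, 0.25)               # equal weight per style
    causal_features = intervene(features, means, stds, uniform_priors)
    print(causal_features.shape)
```

Uniform priors are used here only to mirror the "fairly incorporated" wording in the Figure 1 caption. How the paper actually weights or selects styles (for example through the SGEM's TopK expert allocation) is not specified in this excerpt.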