Invariant Learning via Probability of Sufficient and Necessary Causes

Mengyue Yang, Zhen Fang, Yonggang Zhang, Yali Du, Furui Liu, Jean-Francois Ton, Jianhong Wang, Jun Wang

TL;DR

This work tackles out-of-distribution generalization by moving beyond invariant representations to capture the essential sufficiency and necessity of causal features. It introduces PNS risk as a principled objective to quantify the likelihood that a representation encodes both necessary and sufficient causes for the label, and it derives identifiability and generalization bounds under exogeneity and monotonicity. The authors propose CaSN, a learning framework that minimizes a worst-case PNS risk with semantic separability constraints while enforcing exogeneity through domain-appropriate penalties. Empirical results on synthetic and real-world benchmarks demonstrate improved OOD performance and more interpretable, causally meaningful representations, including strong generalization on PACS, VLCS, Colored MNIST, and SpuCo datasets. Overall, the approach provides a theoretically grounded and practically effective pathway for robust causal representation learning in the wild.

Abstract

Out-of-distribution (OOD) generalization is indispensable for learning models in the wild, where the test distribution is typically unknown and differs from the training distribution. Recent methods derived from causality have shown great potential in achieving OOD generalization. However, existing methods mainly focus on the invariance property of causes, while largely overlooking the properties of sufficiency and necessity. Specifically, a necessary but insufficient cause (feature) is invariant to distribution shift, yet it may not achieve the required accuracy. By contrast, a sufficient yet unnecessary cause (feature) tends to fit specific data well but may carry risk when adapting to a new domain. To capture the information of sufficient and necessary causes, we employ a classical concept, the probability of sufficient and necessary causes (PNS), which indicates the probability that a feature is a necessary and sufficient cause. To associate PNS with OOD generalization, we propose PNS risk and formulate an algorithm to learn representations with a high PNS value. We theoretically analyze and prove the generalizability of the PNS risk. Experiments on both synthetic and real-world benchmarks demonstrate the effectiveness of the proposed method. Implementation details can be found in the GitHub repository: https://github.com/ymy4323460/CaSN.

Paper Structure

This paper contains 38 sections, 7 theorems, 54 equations, 3 figures, 4 tables.

Key Result

Lemma 2.4

If $\mathbf{C}$ is exogenous relative to $Y$, and $Y$ is monotonic relative to $\mathbf{C}$, then
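
In Pearl's classical result for a binary cause and a binary outcome, exogeneity and monotonicity together make PNS identifiable from observational data as $\mathrm{PNS} = P(Y=y \mid \mathbf{C}=c) - P(Y=y \mid \mathbf{C}=\bar{c})$. The minimal sketch below computes a plug-in estimate of that difference on a toy population. The function name `pns_estimate`, the features `a` and `b`, and the three label rules are illustrative assumptions and are not taken from the paper or its repository.

```python
def pns_estimate(c, y):
    """Plug-in estimate of P(Y=1 | C=1) - P(Y=1 | C=0) for binary lists.

    This equals PNS only if C is exogenous relative to Y and Y is
    monotonic relative to C (assumed here, not checked by the code).
    """
    y_given_c1 = [yi for ci, yi in zip(c, y) if ci == 1]
    y_given_c0 = [yi for ci, yi in zip(c, y) if ci == 0]
    if not y_given_c1 or not y_given_c0:
        raise ValueError("both values of C must appear in the data")
    return sum(y_given_c1) / len(y_given_c1) - sum(y_given_c0) / len(y_given_c0)


# Hypothetical toy population: two independent binary features a and b,
# with every (a, b) combination appearing exactly once.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]

# Three hypothetical label rules, each evaluated with a as the candidate cause.
y_and = [ai & bi for ai, bi in zip(a, b)]  # a is necessary but not sufficient
y_or = [ai | bi for ai, bi in zip(a, b)]   # a is sufficient but not necessary
y_eq = list(a)                             # a is both necessary and sufficient

pns_and = pns_estimate(a, y_and)  # 0.5 - 0.0 = 0.5
pns_or = pns_estimate(a, y_or)    # 1.0 - 0.5 = 0.5
pns_eq = pns_estimate(a, y_eq)    # 1.0 - 0.0 = 1.0
```

In this toy setting the feature that is only necessary and the feature that is only sufficient both score 0.5, while the feature that is both scores 1.0, which matches the intuition in the abstract that a high PNS value singles out causes that are simultaneously sufficient and necessary.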

Figures (3)

  • Figure 1: (a) Examples of causal sufficiency and necessity in cat classification. (b) The causal graph for the OOD generalization problem. The arrows denote the causal generative direction, and the dashed line connects spuriously correlated variables. Notation is formally defined in Section \ref{sec:notations}.
  • Figure 2: Synthetic results validating the properties of the learned representation under different degrees of spuriousness in the data: $s=0.1$ for (a) and $s=0.7$ for (b). The x-axis shows the different causal information and the y-axis shows the choice of $\delta$. (c) Feature identification results when $s=0.7$.
  • Figure 3: Example of causal sufficiency and necessity in an image classification problem. The images on the left are used for training and those on the right for OOD testing.

Theorems & Definitions (12)

  • Definition 2.1: Probability of Necessary and Sufficient (PNS) [pearl2009causality]
  • Definition 2.2: Exogeneity [pearl2009causality]
  • Definition 2.3: Monotonicity [pearl2009causality]
  • Lemma 2.4 [pearl2009causality]
  • Proposition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 4.3
  • Theorem A.1
  • Definition E.1: Sufficient Statistic [fisher1922mathematical; DBLP:journals/tcs/ShamirST10]
  • ...and 2 more