Fundamental Computational Limits in Pursuing Invariant Causal Prediction and Invariance-Guided Regularization

Yihong Gu, Cong Fang, Yang Xu, Zijian Guo, Jianqing Fan

TL;DR

This work studies the fundamental computational limits of learning invariant causal predictors across heterogeneous environments. It shows that the decision problem ExistsLIS-2 is NP-hard even for the linear model with two environments, which implies that no polynomial-time method can achieve fast estimation rates in the worst case if $\mathsf{P}\neq\mathsf{NP}$. To address this, the authors introduce invariance-guided regularization (IGR), a simple, computation-budgeted estimator with an ellipse-shaped uncertainty set that downweights spurious directions and yields a smooth interpolation between the most predictive solution and the causal solution via a tunable invariance parameter $\gamma$. They establish a distributionally robust interpretation of the method, provide non-asymptotic analysis, and demonstrate empirical efficacy on real data (stock returns and climate dynamics), showing improved worst-case robustness across environments. Overall, the paper clarifies a fundamental trade-off between computational tractability and causal identifiability in invariance pursuit and offers a practical, theoretically grounded approach when mild structure is available.

Abstract

Pursuing invariant prediction from heterogeneous environments opens the door to learning causality in a purely data-driven way and has several applications in causal discovery and robust transfer learning. However, existing methods such as ICP [Peters et al., 2016] and EILLS [Fan et al., 2024] that can attain sample-efficient estimation are based on exponential-time algorithms. In this paper, we show that such a problem is intrinsically hard in computation: the decision problem, testing whether a non-trivial prediction-invariant solution exists across two environments, is NP-hard even for a linear causal relationship. In the world where P$\neq$NP, our results imply that the estimation error rate can be arbitrarily slow using any computationally efficient algorithm. This suggests that pursuing causality is fundamentally harder than detecting associations when no prior assumption is available. Given there is almost no hope of computational improvement in the worst case, this paper proposes a method capable of attaining both computationally and statistically efficient estimation under additional conditions. Furthermore, our estimator is a distributionally robust estimator with an ellipse-shaped uncertainty set where more uncertainty is placed on spurious directions than invariant directions, resulting in a smooth interpolation between the most predictive solution and the causal solution by varying the invariance hyper-parameter. Non-asymptotic results and empirical applications support the claim.
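To make the exponential cost of exhaustive invariance search concrete, the following is a minimal sketch and not the authors' algorithm, nor ICP or EILLS. It assumes a working definition that this page does not spell out: a variable subset is treated as invariant when its population least-squares coefficients agree across all environments, and as non-trivial when those coefficients are not all zero. The function names `ls_coef` and `exists_invariant_set`, the tolerance `tol`, and the toy moments are hypothetical choices made for illustration.

```python
from itertools import combinations

import numpy as np


def ls_coef(cov_xx, cov_xy, subset):
    """Population least-squares coefficients of Y on X restricted to `subset`."""
    idx = list(subset)
    return np.linalg.solve(cov_xx[np.ix_(idx, idx)], cov_xy[idx])


def exists_invariant_set(envs, tol=1e-8):
    """Brute-force search over all non-empty subsets (2^d - 1 candidates).

    envs: list of (E[X X^T], E[X Y]) pairs, one pair per environment.
    Returns the first subset whose coefficients are non-zero and equal
    across environments (assumed definition), or None if there is none.
    """
    d = envs[0][0].shape[0]
    for k in range(1, d + 1):
        for subset in combinations(range(d), k):
            coefs = [ls_coef(sigma, u, subset) for sigma, u in envs]
            if np.max(np.abs(coefs[0])) <= tol:
                continue  # trivial (all-zero) solution, skip it
            if all(np.allclose(coefs[0], c, rtol=0.0, atol=tol) for c in coefs[1:]):
                return subset
    return None


# Toy two-environment model (assumed): X1 -> Y -> X2, where only the noise
# scale of X2 changes between environments. Moments are computed by hand.
envs = [
    (np.array([[1.0, 1.0], [1.0, 3.0]]), np.array([1.0, 2.0])),
    (np.array([[1.0, 1.0], [1.0, 6.0]]), np.array([1.0, 2.0])),
]
print(exists_invariant_set(envs))  # (0,)
```

In this toy model only the causal variable `X1` (index 0) gives matching coefficients; the subsets containing the reverse-causal `X2` do not. The loop visits up to $2^d - 1$ subsets, which is the kind of exponential enumeration that the hardness result suggests cannot be avoided in the worst case without further structure.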

Paper Structure

This paper contains 55 sections, 30 theorems, 260 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Lemma 2.1

The problem 3Sat is NP-hard under deterministic polynomial-time reduction. The problem 3Sat-Unique is NP-hard under randomized polynomial-time reduction.
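As a small illustration of the two problems named in the lemma, the sketch below decides 3Sat by exhaustive enumeration and counts satisfying assignments. It assumes that 3Sat-Unique refers to the variant in which a formula is promised to have at most one satisfying assignment; this page does not define it. The signed-integer clause encoding and the function names `count_satisfying` and `is_satisfiable` are hypothetical conventions, not notation from the paper.

```python
from itertools import product


def count_satisfying(clauses, n_vars):
    """Count assignments satisfying a CNF formula by trying all 2^n_vars cases.

    clauses: list of tuples of non-zero ints; literal k means x_k is true,
    literal -k means x_k is false (variables are numbered from 1).
    """
    count = 0
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count


def is_satisfiable(clauses, n_vars):
    """3Sat decision by brute force: is there at least one satisfying assignment?"""
    return count_satisfying(clauses, n_vars) > 0


# (x1 or x2 or x3) and (not x1 or not x2 or x3)
example = [(1, 2, 3), (-1, -2, 3)]
print(is_satisfiable(example, 3))    # True
print(count_satisfying(example, 3))  # 6
```

Each clause over three distinct variables rules out exactly one of the eight assignments, so the example leaves six. The enumeration takes time exponential in the number of variables, which is consistent with, though of course not a proof of, the hardness stated in the lemma.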

Figures (3)

  • Figure 1: (a) A structural causal model illustration of the multi-environment model in \ref{ex2}: an arrow from node $u$ to node $v$ labeled with a number $s$ means that $u$ has a linear causal effect $s$ on $v$. (b) Visualization of the uncertainty set $\Theta_\gamma$ at three checkpoints $\gamma\in \{0.4, 2, 3.6\}$, together with the regularization path of the proposed estimator \ref{eq:minimax-k1} in the three-dimensional parameter space $\beta \in \mathbb{R}^3$. For each $\gamma$, the uncertainty set $\Theta_\gamma$ is a two-dimensional plane filled with colors changing from red to blue as $\gamma$ increases. The upper panel of (c) depicts how each coordinate $j\in [3]$ of the population-level solution $\beta^\gamma \in \mathbb{R}^3$ changes with $\gamma$: the causal variable is represented by a green solid line, and the two spurious (reverse causal) variables are represented by yellow dashed ($\beta_2$) and dotted ($\beta_3$) lines, respectively. The lower panel of (c) plots the counterpart for the FAIR-Linear estimator in \cite{gu2024causality}.
  • Figure 2: The estimated coefficients of the selected variables are shown for $\mathcal{D}_1 \cup \mathcal{D}_2$ and $\mathcal{D}_6$. Warm colors represent positive coefficients, while cool colors indicate negative coefficients. Variables are denoted as $(\tau, j)$, where $\tau \in \{0, 1\}$ represents the time lag, and $j$ indicates the stock index.
  • Figure 3: The paths identified by our approach among the six regions (No. 20, 23, 38, 40, 48, and 49) in the air temperature task (air). The edge colors represent the path coefficients, while the labels indicate the time lags in days.

Theorems & Definitions (44)

  • Definition 1: Decision Problem
  • Example 2.1: An Instance of 3Sat Problem
  • Definition 2: Reduction
  • Definition 3: NP-hardness
  • Lemma 2.1
  • Definition 4: Invariant Set and Maximum Invariant Set
  • Example 2.2: An Instance of ExistLIS-Ident Problem
  • Example 2.3: An Instance of ExistLIS that is not ExistLIS-Ident
  • Theorem 2.1
  • Remark 1: NP-hardness under More Restrictive Conditions
  • ...and 34 more