Invariant Causal Prediction for Nonlinear Models

Christina Heinze-Deml, Jonas Peters, Nicolai Meinshausen

TL;DR

This work extends Invariant Causal Prediction (ICP) to nonlinear, nonparametric settings by developing and evaluating several conditional independence tests that exploit invariance across environments. The core idea remains that the target Y’s dependence on its direct causes is invariant to interventions on other variables, and the authors introduce defining sets, confidence bands for causal effects, and predictive tools under interventions. Through comprehensive simulations, they show that an invariant residual distribution test, combined with pooled nonlinear modeling, delivers robust performance across diverse scenarios, while power is highly dependent on the true causal structure and the size of the parental set. A real-world fertility-rate analysis demonstrates the framework’s practical utility and reaffirms the central causal role of child mortality rates. The authors also provide software implementations (nonlinearICP and CondIndTests) to facilitate adoption in causal discovery and intervention planning.

Abstract

An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system's underlying causal structure. To this end, Invariant Causal Prediction (ICP) (Peters et al., 2016) has been proposed, which learns a causal model by exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straightforward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence. In this work, we present and evaluate an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables. We find that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings. We call this procedure the "invariant residual distribution test". In general, we observe that the performance of all approaches is critically dependent on the true (unknown) causal structure, and it becomes challenging to achieve high power if the parental set includes more than two variables. As a real-world example, we consider fertility rate modelling, which is central to world population projections. We explore predicting the effect of hypothetical interventions using the accepted models from nonlinear ICP. The results reaffirm the previously observed central causal role of child mortality rates.
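The two-step procedure described in the abstract (fit one nonlinear model on data pooled over all environments, then compare the residual distributions across environments) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper or from the nonlinearICP and CondIndTests packages: the leave-one-out k-nearest-neighbour regressor, the permutation test on a Kolmogorov-Smirnov distance, the simulated data, and all function names.

```python
import math
import random
from bisect import bisect_right


def loo_knn_residuals(X, y, k=10):
    """Residuals of a leave-one-out k-NN regression fitted on the pooled data."""
    residuals = []
    for i, xi in enumerate(X):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        prediction = sum(y[j] for j in neighbours) / len(neighbours)
        residuals.append(y[i] - prediction)
    return residuals


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov distance between samples a and b."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )


def invariance_p_value(X, y, env, n_perm=200, seed=0):
    """Permutation p-value for H0: residuals have the same distribution in all environments."""
    rng = random.Random(seed)
    residuals = loo_knn_residuals(X, y)

    def statistic(labels):
        # Largest KS distance between one environment and all remaining ones.
        stats = []
        for e in set(labels):
            inside = [r for r, l in zip(residuals, labels) if l == e]
            outside = [r for r, l in zip(residuals, labels) if l != e]
            stats.append(ks_statistic(inside, outside))
        return max(stats)

    observed = statistic(env)
    labels = list(env)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between residuals and environments
        if statistic(labels) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


def simulate(n_per_env=150, seed=1):
    """Hypothetical data: X1 -> Y -> X2, with interventions on X1 and X2 in environment 1."""
    rng = random.Random(seed)
    x1, x2, y, env = [], [], [], []
    for e in (0, 1):
        for _ in range(n_per_env):
            parent = rng.gauss(0.5 * e, 1.0 + 0.5 * e)      # shift and scale of X1
            target = math.sin(parent) + 0.3 * rng.gauss(0, 1)  # mechanism of Y is unchanged
            child = target + 1.0 * e + 0.3 * rng.gauss(0, 1)   # shift of X2
            x1.append(parent)
            x2.append(child)
            y.append(target)
            env.append(e)
    return x1, x2, y, env


x1, x2, y, env = simulate()
p_parent = invariance_p_value([(v,) for v in x1], y, env)
p_child = invariance_p_value([(v,) for v in x2], y, env)
print(f"S = {{X1}} (true parent): p = {p_parent:.3f}")
print(f"S = {{X2}} (child):       p = {p_child:.3f}")
```

In this simulated setting the candidate set containing only the child X2 should be rejected, because the shift intervention on X2 changes the relation between X2 and Y across environments, whereas the set containing the true parent X1 leaves the residual distribution approximately invariant. A full ICP run would repeat such a test over all candidate subsets and report the intersection of the accepted sets.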

Paper Structure

This paper contains 63 sections, 27 equations, 14 figures, 2 tables, 6 algorithms.

Figures (14)

  • Figure 1: Three candidates for a causal DAG with target total fertility rate (TFR) and four potential causal predictor variables. We would like to infer the parents of TFR in the true causal graph. We use the continent as the environment variable $E$. If the true DAG were one of the two graphs on the left, the environmental variable would have no direct influence on the target variable TFR and 'Continent' would be a valid environmental variable, see Definition 1.
  • Figure 2: Data for Nigeria in 1993: The union of the confidence bands $\hat{\mathcal{F}}_S$, denoted by $\hat{\mathcal{F}}$, bounds the average causal effect of varying the variables in the defining set $\hat{D}_1 = \lbrace \text{IMR, Q5} \rbrace$ on the target $\log(\text{TFR})$. IMR and Q5 have been varied individually, see panels \ref{fig:def_sets_ind1} and \ref{fig:def_sets_ind2}, as well as jointly, see panel \ref{fig:def_sets_jointly}, over their respective quantiles. In panels \ref{fig:def_sets_ind1} and \ref{fig:def_sets_ind2}, we do not observe consistent effects different from zero as some of the accepted models do not contain IMR and some do not contain Q5. However, when varying the variables $\hat{D}_1 =\lbrace \text{IMR, Q5} \rbrace$ jointly (see panel \ref{fig:def_sets_jointly}), we see that all accepted models predict an increase in expected $\log(\text{TFR})$ as IMR and Q5 increase.
  • Figure 3: \ref{fig:ace} Bounds for the average causal effect of setting the variables IMR and Q5 in the African countries in 2013 to European levels, that is $\tilde{x}$ differs from the country-specific observed values $x$ in that the child mortality rates $\text{IMR}$ and $\text{Q5}$ have been set to their respective European average. The implied coverage guarantee is 80% as we chose $\alpha=0.1$. \ref{fig:rf} Random Forest regression model using all covariates as input. The (non-causal) regression effect coverage is again set to 80%. We will argue below that the confidence intervals obtained by random forest are too small, see Table \ref{tab:datasets} and Figure \ref{fig:cv_cov_all}.
  • Figure 4: The confidence intervals show the predicted change in $\log{(\text{TFR})}$ from 1973 -- 2008 for all African countries when not using their data in the nonlinear ICP estimation (using invariant conditional quantile prediction with $\alpha = 0.1$). In other words, only the data from the remaining five continents was used during training. The horizontal line segments mark the union over the accepted models' intervals for the predicted change; the blue squares show the true change in $\log{(\text{TFR})}$ from 1973 -- 2008. Only those countries are displayed for which the response was not missing in the data, i.e. where $\log{(\text{TFR})}$ in 1973 and in 2008 were recorded. The coverage is $25/26 \approx 0.96$.
  • Figure 5: The confidence intervals show the predicted change in $\log{(\text{TFR})}$ from 1973 -- 2008 for all countries when not using the data of the country's continent in the estimation (with implied coverage guarantee 80%). Only those countries are displayed for which $\log{(\text{TFR})}$ in 1973 and 2008 were not missing in the data. For nonlinear ICP the shown intervals are the union over the accepted models' intervals.
  • ...and 9 more figures

Theorems & Definitions (6)

  • Definition 1: Environmental variables
  • Example 1
  • Example 2: Linear model and nonlinear data
  • Example 3: Linear model and nonlinear data
  • Example 4
  • Example 5