Detecting and Measuring Confounding Using Causal Mechanism Shifts

Abbavaram Gowtham Reddy, Vineeth N Balasubramanian

Abstract

Detecting and measuring confounding effects from data is a key challenge in causal inference. Existing methods frequently assume causal sufficiency, disregarding the presence of unobserved confounding variables. Causal sufficiency is both unrealistic and empirically untestable. Additionally, existing methods make strong parametric assumptions about the underlying causal generative process to guarantee the identifiability of confounding variables. Relaxing the causal sufficiency and parametric assumptions and leveraging recent advancements in causal discovery and confounding analysis with non-i.i.d. data, we propose a comprehensive approach for detecting and measuring confounding. We consider various definitions of confounding and introduce tailored methodologies to achieve three objectives: (i) detecting and measuring confounding among a set of variables, (ii) separating observed and unobserved confounding effects, and (iii) understanding the relative strengths of confounding bias between different sets of variables. We present useful properties of a confounding measure and present measures that satisfy those properties. Empirical results support the theoretical analysis.

Paper Structure

This paper contains 12 sections, 16 theorems, 11 equations, 4 figures, 4 tables, 1 algorithm.

Key Result

Proposition 4.1

(Identifiability of $\mathbb{P}(X_j|do(X_i))$) $\mathbb{P}(X_j|do(X_i))$ is identifiable from the set of contexts $\mathbf{C}_{\{i\} \wedge \neg P_{ij}}$. To detect and measure confounding between a pair of nodes $X_i,X_j$, it is enough to observe two sets of contexts $\mathbf{C}_{\{i\} \wedge \neg P

Figures (4)

  • Figure 1: Setting 1: When the contexts $\mathbf{C}_{\{i\} \wedge \neg P_{ij}}$ and $\mathbf{C}_{\{j\} \wedge \neg P_{ji}}$ are known, where $P_{ij}$ is the set of node indices that belong to a path from $X_i$ to $X_j$, including $j$, we leverage directed information from $X_i$ to $X_j$ and from $X_j$ to $X_i$ to define a measure of confounding (Defn. \ref{def confounding1}). Setting 2: Causal mechanism changes in $Z$ introduce dependencies between the observed distributions of $X_i$ and $X_j$. We leverage such dependencies to measure confounding when the contexts $\mathbf{C}_{\{i\}\wedge \{j\}}$ are known (Defn. \ref{def confounding2}). Setting 3: If we know that there is a causal path from $X_i$ to $X_j$, we leverage dependencies between the pairs $(X_i, X_j)$ and $(Z, X_j)$ to measure confounding. Similarly, if we know that there is a causal path from $X_j$ to $X_i$, we leverage dependencies between the pairs $(X_i, X_j)$ and $(Z, X_i)$ to measure confounding (Defn. \ref{def confounding3}). Dashed arrows from $Z$ indicate that $Z$ is unobserved.
  • Figure 2: Measure of confounding between a pair of variables $X_i, X_j$. Our measures output zero when there is no confounding between $X_i,X_j$ and output positive values when $X_i,X_j$ are confounded.
  • Figure 3: Left: Conditioning on one of $\emptyset, Z_1, Z_2$ will not remove confounding between $X_i, X_j$ in $\mathcal{G}_5$. Hence, $\text{CNF-2}$ returns positive values. Right: In $\mathcal{G}_6$, conditioning on $\emptyset$ does not remove the confounding effect of $Z$ on $X_i, X_j$. Hence, we observe a positive value for $\text{CNF-2}(X_i,X_j|\emptyset)$. Conditioning on $Z$ will block the confounding between $X_i, X_j$. Hence, $\text{CNF-2}$ is closer to zero.
  • Figure 4: Two real-world examples where our method can be applied. Here Pro: Production Volume, Exp: Exports, Lab: Total Labor Required, Edu: Education, Wag: Wages, Inv: Investments. We can perform interventions on the above variables and any combination thereof to obtain context-specific data. We can use such data to identify and measure confounding by applying our methods.
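
The intuition behind Setting 2 in Figure 1 can be illustrated with a small simulation. The sketch below is not the paper's measure: it assumes a linear-Gaussian model, assumes that each context shifts the mean of the unobserved $Z$ as well as the mechanisms of $X_i$ and $X_j$, and uses the absolute correlation of per-context means as a stand-in dependence score. The function names `context_means` and `cross_context_dependence` are hypothetical.

```python
import numpy as np


def context_means(n_contexts, n_samples, confounded, rng):
    """Simulate several contexts and return the per-context means of X_i and X_j.

    Assumption: linear-Gaussian mechanisms; each context shifts the mean of
    the unobserved Z and, independently, the mechanisms of X_i and X_j.
    """
    weight = 1.0 if confounded else 0.0  # strength of Z -> X_i and Z -> X_j
    means = np.empty((n_contexts, 2))
    for c in range(n_contexts):
        z_shift = rng.normal(0.0, 2.0)    # mechanism shift of the hidden Z
        xi_shift = rng.normal(0.0, 1.0)   # independent mechanism shift of X_i
        xj_shift = rng.normal(0.0, 1.0)   # independent mechanism shift of X_j
        z = z_shift + rng.normal(size=n_samples)
        x_i = weight * z + xi_shift + rng.normal(size=n_samples)
        x_j = weight * z + xj_shift + rng.normal(size=n_samples)
        means[c] = x_i.mean(), x_j.mean()  # only X_i and X_j are observed
    return means


def cross_context_dependence(means):
    """Absolute correlation of per-context means (illustrative score only)."""
    return abs(np.corrcoef(means[:, 0], means[:, 1])[0, 1])


rng = np.random.default_rng(0)
dep_confounded = cross_context_dependence(context_means(400, 200, True, rng))
dep_unconfounded = cross_context_dependence(context_means(400, 200, False, rng))
print(f"confounded:   {dep_confounded:.2f}")
print(f"unconfounded: {dep_unconfounded:.2f}")
```

Under these assumptions, the score is clearly positive when $Z$ drives both variables and stays near zero when it does not, which mirrors the qualitative behavior described for Figure 2. A correlation of means captures only shifts in location; a dependence measure over the full distributions would be needed in general.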

Theorems & Definitions (31)

  • Definition 4.1
  • Definition 4.2
  • Definition 4.3
  • Definition 4.4
  • Proposition 4.1
  • Definition 4.5
  • Theorem 4.1
  • Theorem 4.2
  • Proposition 4.2
  • Definition 4.6
  • ...and 21 more