The most likely common cause

A. Hovhannisyan; A. E. Allahverdyan

TL;DR

The paper addresses causal insufficiency: a latent common cause $C$ is assumed for the observed variables $A$ and $B$, but only the joint distribution $p(a,b)$ is observed, so $C$ is not identifiable. It introduces a generalized likelihood (GL) $L_\beta$ with $0<\beta<1$, related to the maximum entropy principle, to infer the most likely $p(a,c)$ and $p(b|c)$ consistent with the common cause principle (CCP). Identifiability is addressed by fixing $|C|$, and the approach is linked to free-energy concepts. In binary setups, the inferred cause changes non-analytically, in a phase-transition-like manner, as the observed correlations shift. The analysis extends to three variables, where latent causes can induce structures beyond DAG representations (TODAGs). The work also compares GL with predictive likelihood and minimum common cause entropy, arguing that GL provides a more consistent and flexible framework for latent-confounder inference, with implications for causal modeling and potential extensions to higher dimensions and continuous domains.

Abstract

The common cause principle for two random variables $A$ and $B$ is examined in the case of causal insufficiency, when their common cause $C$ is known to exist, but only the joint probability of $A$ and $B$ is observed. As a result, $C$ cannot be uniquely identified (the latent confounder problem). We show that the generalized maximum likelihood method can be applied to this situation and allows identification of $C$ that is consistent with the common cause principle. It closely relates to the maximum entropy principle. Investigation of the two binary symmetric variables reveals a non-analytic behavior of conditional probabilities reminiscent of a second-order phase transition. This occurs during the transition from correlation to anti-correlation in the observed probability distribution. The relation between the generalized likelihood approach and alternative methods, such as predictive likelihood and the minimum common cause entropy, is discussed. The consideration of the common cause for three observed variables (and one hidden cause) uncovers causal structures that defy representation through directed acyclic graphs with the Markov condition.
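The non-identifiability stated in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical binary joint `p_ab` (the numbers are not from the paper) and builds two different binary causes $C$, one copying $A$ and one copying $B$, that both reproduce the same $p(a,b)$ through $p(a,b)=\sum_c p(c)\,p(a|c)\,p(b|c)$. The helper names `joint_from_cause` and `close` are illustrative only, and the sketch does not implement the generalized likelihood itself.

```python
def joint_from_cause(p_c, p_a_given_c, p_b_given_c):
    """Return p(a,b) = sum_c p(c) p(a|c) p(b|c) as a nested list [a][b]."""
    n_c = len(p_c)
    n_a = len(p_a_given_c[0])
    n_b = len(p_b_given_c[0])
    return [
        [
            sum(p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b] for c in range(n_c))
            for b in range(n_b)
        ]
        for a in range(n_a)
    ]


def close(x, y, tol=1e-12):
    """Element-wise comparison of two 2x2 nested lists."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))


# Hypothetical observed joint p(a,b); rows index a, columns index b.
p_ab = [[0.4, 0.1], [0.2, 0.3]]
p_a = [sum(row) for row in p_ab]                    # marginal of A
p_b = [p_ab[0][b] + p_ab[1][b] for b in range(2)]   # marginal of B
identity = [[1.0, 0.0], [0.0, 1.0]]

# Candidate 1: C is a copy of A, so p(c)=p(a), p(a|c) is a delta, p(b|c)=p(b|a).
cause_1 = (
    p_a,
    identity,
    [[p_ab[c][b] / p_a[c] for b in range(2)] for c in range(2)],
)

# Candidate 2: C is a copy of B, so p(c)=p(b), p(a|c)=p(a|b), p(b|c) is a delta.
cause_2 = (
    p_b,
    [[p_ab[a][c] / p_b[c] for a in range(2)] for c in range(2)],
    identity,
)

# Both candidates reproduce the observed joint, yet they differ in p(c).
print(close(joint_from_cause(*cause_1), p_ab))
print(close(joint_from_cause(*cause_2), p_ab))
```

Both checks print `True` although the two candidates assign different marginals to $C$, so the observed joint alone cannot single out one cause.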

Paper Structure (27 sections, 68 equations, 3 figures)

Introduction
Generalized likelihood
Features of generalized likelihood
Application to the common cause principle
Consistency with the common cause principle
Relations with the maximum entropy method
Most likely minimal cause for symmetric binary variables
Phase transition between correlated and anti-correlated situations
Causation between events
Most likely common cause for three variables
Definition of the problem
Two extreme cases of initial 3-variable correlations
DAGs and TODAGs
Predictive likelihood, minimum entropy, and sparse common cause
Predictive (maximum a posteriori) likelihood
...and 12 more sections

Figures (3)

Figure 1: For three random variables $A=\{a, \bar{a}\}$, $B=\{b, \bar{b}\}$ and $C=\{c,\bar{c}\}$, this figure presents the conditional probabilities in (\ref{co1}, \ref{co2}) for a concrete numerical example with $p(a,\bar{b})=p(\bar{a},b)=0.175$ and $0\leq p(\bar{a},\bar{b}) \leq 0.65$. Thus $A$ and $B$ are correlated for $0.0511< p(\bar{a},\bar{b}) <0.5989$ and anti-correlated for $0< p(\bar{a},\bar{b}) <0.0511$ and $0.5989< p(\bar{a},\bar{b})<0.65$; cf. (\ref{co1}, \ref{co2}). Curves that are drawn close together in fact coincide. The curve for $p(c)$ is non-monotonic, with one global maximum and one global minimum. The maximization of $L_{\beta\lesssim 1}$ in (\ref{beta1}, \ref{bvana}) was carried out with $\beta = 0.95$ under the additional condition $p(b|c)>p(b| \bar{c})$, which secures a continuous transition between the correlated and anti-correlated situations. Several quantities are non-analytic along this transition, e.g. $p(b|c)$, $p(b| \bar{c})$, and $p(c)$. In particular, $p(c)=1/2$ in the anti-correlated situation. The maximization with $\beta =0.99$ produces the same results.
Figure 2: The left Causal Markov Conditioned Directed Acyclic Graph (DAG) denotes the correlation structure in (\ref{guenter}). The right DAG denotes the correlation structure implied by the most likely cause $C$ of $A=(A_1,A_2)$ and $B$; see (\ref{uto}).
Figure 3: The left Causal Markov Conditioned Directed Acyclic Graph (DAG) denotes the correlation structure in (\ref{mah}). The right DAG denotes the correlation structure implied by the most likely cause $C$ of $A=(A_1,A_2)$ and $B$.
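
The correlation thresholds quoted in the Figure 1 caption can be checked with a short sketch. It relies on the standard identity for two binary variables, $p(a,b)-p(a)p(b)=p(a,b)\,p(\bar{a},\bar{b})-p(a,\bar{b})\,p(\bar{a},b)$, and on the caption's value $p(a,\bar{b})=p(\bar{a},b)=0.175$. The function names are illustrative and are not taken from the paper.

```python
import math

P_OFF = 0.175  # p(a, not-b) = p(not-a, b), as given in the Figure 1 caption


def covariance(x, off=P_OFF):
    """Covariance of the indicators of a and b when p(not-a, not-b) = x."""
    p_ab = 1.0 - 2.0 * off - x  # normalization fixes p(a, b)
    return p_ab * x - off * off


def correlation_thresholds(off=P_OFF):
    """Roots of x * (1 - 2*off - x) = off**2, where the covariance vanishes."""
    s = 1.0 - 2.0 * off
    root = math.sqrt(s * s - 4.0 * off * off)
    return (s - root) / 2.0, (s + root) / 2.0


low, high = correlation_thresholds()
print(round(low, 4), round(high, 4))  # -> 0.0511 0.5989
```

The covariance is positive between the two roots and negative outside them, matching the correlated and anti-correlated ranges stated in the caption.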
