Local Causal Structure Learning in the Presence of Latent Variables
Feng Xie, Zheng Li, Peng Wu, Yan Zeng, Chunchen Liu, Zhi Geng
TL;DR
The paper addresses local causal structure learning when latent confounders may be present, proposing the MMB-by-MMB algorithm to identify the direct causes and effects of a target variable from local structures alone. Under the standard causal Markov and faithfulness assumptions, it establishes theoretical consistency: the structure learned locally around the target agrees with what global learning would recover for that target. The method reasons about m-separation and V-structures within local Markov blankets and uses principled stop rules to keep the local result aligned with global learning. Empirical results on synthetic benchmarks and real gene expression data show that MMB-by-MMB outperforms global methods and other local approaches in accuracy and efficiency, especially in latent-variable settings. The work advances practical local causal discovery in complex systems, with potential extensions using background knowledge and interventional data to further improve identifiability.
Abstract
Discovering causal relationships from observational data, particularly in the presence of latent variables, is a challenging problem. Current local structure learning methods are effective and efficient when only the local relationships of a target variable are of interest, but they assume causal sufficiency. This assumption implies that all common causes of the measured variables are observed, leaving no room for latent variables. Such a premise is easily violated in real-world applications, resulting in inaccurate structures that may adversely impact downstream tasks. In light of this, we investigate how to locally identify potential parents and children of a target from observational data that may include latent variables. Specifically, we harness the causal information from m-separation and V-structures to derive theoretical consistency results, effectively bridging the gap between global and local structure learning. Together with newly developed stop rules, we present a principled method for determining whether a variable is a direct cause or effect of a target. Further, we theoretically demonstrate the correctness of our approach under the standard causal Markov and faithfulness conditions in the infinite-sample limit. Experimental results on both synthetic and real-world data validate the effectiveness and efficiency of our approach.
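To make the causal sufficiency issue concrete, the following minimal sketch simulates a single latent common cause of two measured variables. It is an illustration only, not the MMB-by-MMB algorithm: the linear-Gaussian model, the unit coefficients, and the helper names (`pearson`, `partial_corr`, `simulate`) are all assumptions introduced here. When the latent variable is unobserved, the two measured variables look dependent even though neither causes the other; the dependence disappears only when conditioning on the latent variable, which a method assuming causal sufficiency cannot do.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys given zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


def simulate(n=20000, seed=0):
    """Hypothetical model: latent -> x and latent -> y, no edge between x and y."""
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(n)]
    x = [value + rng.gauss(0, 1) for value in latent]
    y = [value + rng.gauss(0, 1) for value in latent]
    return latent, x, y


latent, x, y = simulate()

# Seen without the latent variable, x and y are clearly correlated
# (the population correlation in this model is 0.5).
marginal = pearson(x, y)

# Conditioning on the latent variable removes the dependence
# (the population partial correlation in this model is 0).
conditional = partial_corr(x, y, latent)

print(f"corr(x, y)          = {marginal:.3f}")
print(f"corr(x, y | latent) = {conditional:.3f}")
```

In this toy model, a procedure that only sees `x` and `y` would have no observed variable that separates them, so it could wrongly place a direct edge between the two.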
