Relational Causal Discovery with Latent Confounders
Matteo Negro, Andrea Piras, Ragib Ahsan, David Arbour, Elena Zheleva
TL;DR
RelFCI learns causal structure from relational data in the presence of latent confounders, extending causal discovery beyond the i.i.d. setting. The paper introduces Latent Relational Causal Models (LRCMs) and three lifted representations (MAAGG, PAAGG, and PARM) that model relational dependencies and their equivalence classes across perspectives, using a hop threshold $h$ (with $h' \ge 2h$ for latent paths). RelFCI extends FCI and RCD to relational domains and is proven sound and complete, assuming ideal conditional independence tests and the stated assumptions. Experiments on synthetic data with latent confounders show that it is effective at identifying the correct causal structure. By bringing latent confounding into relational causal discovery, the work is a step toward causal effect estimation in complex relational systems.
Abstract
Estimating causal effects from real-world relational data can be challenging when the underlying causal model and potential confounders are unknown. While several causal discovery algorithms exist for learning causal models with latent confounders from data, they assume that the data is independent and identically distributed (i.i.d.) and are not well-suited for learning from relational data. Similarly, existing relational causal discovery algorithms assume causal sufficiency, which is unrealistic for many real-world datasets. To address this gap, we propose RelFCI, a sound and complete causal discovery algorithm for relational data with latent confounders. Our work builds upon the Fast Causal Inference (FCI) and Relational Causal Discovery (RCD) algorithms and defines the new graphical models necessary to support causal discovery in relational domains. We also establish soundness and completeness guarantees for relational d-separation with latent confounders. We present experimental results demonstrating the effectiveness of RelFCI in identifying the correct causal structure in relational causal models with latent confounders.
