Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets

Christine W. Bang; Vanessa Didelez

TL;DR

This paper extends constraint-based causal discovery to settings with latent variables and multiple overlapping datasets by exploiting tiered background knowledge. It formalizes tiered ordering, studies the existing tFCI algorithm, and introduces the new tIOD algorithm, each in a simple and a full variant. Under oracle conditions, the full variants are proved sound and the simple variants sound and complete. The work shows that leveraging tiered knowledge can substantially improve identifiability, reduce computation, and yield more informative outputs, with practical relevance for multi-cohort and longitudinal studies. It also discusses robustness in finite samples and outlines directions for extending these ideas to other FCI variants and time-series contexts.

Abstract

In this paper we consider the use of tiered background knowledge within constraint-based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e., allowing for latent variables, which may arise because relevant information could not be measured at all, or not jointly, as in the case of multiple overlapping datasets. We first present novel insights into the properties of the 'tiered FCI' (tFCI) algorithm. Building on this, we introduce a new extension of the IOD (integrating overlapping datasets) algorithm incorporating tiered background knowledge, the 'tiered IOD' (tIOD) algorithm. We show that under full usage of the tiered background knowledge, tFCI and tIOD are sound, while simple versions of tIOD and tFCI are sound and complete. We further show that the tIOD algorithm can often be expected to be considerably more efficient and informative than the IOD algorithm, even beyond the obvious restriction of the Markov equivalence classes. We provide a formal result on the conditions for this gain in efficiency and informativeness. Our results are accompanied by a series of examples illustrating the exact role and usefulness of tiered background knowledge.
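As a rough illustration of what tiered background knowledge contributes, the minimal sketch below assumes the usual reading of a tiered ordering: a variable in a later tier cannot be a cause of a variable in an earlier tier, so every cross-tier adjacency receives an arrowhead at its later-tier endpoint. The variable names, tier assignment, skeleton, and the helper `orient_cross_tier` are hypothetical and not taken from the paper. The code shows only this single orientation step and is not an implementation of tFCI or tIOD.

```python
# Hypothetical tier assignment: a lower index means an earlier tier
# (for example, an earlier measurement wave in a cohort study).
tiers = {"A": 1, "B": 1, "C": 2, "D": 3}

# Hypothetical skeleton: adjacencies assumed to come from
# conditional independence tests (not computed here).
skeleton = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]


def orient_cross_tier(skeleton, tiers):
    """Assign edge marks to each adjacency using only the tiered ordering.

    Returns a dict mapping (x, y) to (mark_at_x, mark_at_y), where a mark
    is "circle" (undetermined) or "arrowhead" (endpoint is not a cause of
    the other endpoint).
    """
    marks = {}
    for x, y in skeleton:
        if tiers[x] < tiers[y]:
            # y is in a later tier, so y cannot cause x: x o-> y
            marks[(x, y)] = ("circle", "arrowhead")
        elif tiers[x] > tiers[y]:
            # x is in a later tier, so x cannot cause y: x <-o y
            marks[(x, y)] = ("arrowhead", "circle")
        else:
            # Same tier: the ordering is uninformative: x o-o y
            marks[(x, y)] = ("circle", "circle")
    return marks


marks = orient_cross_tier(skeleton, tiers)
for (x, y), (mx, my) in marks.items():
    print(f"{x} [{mx}] --- [{my}] {y}")
```

In this toy example, three of the four adjacencies cross tiers and gain an arrowhead before any orientation rule is applied, while the within-tier edge between `A` and `B` stays undetermined.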
