
Expert-Aided Causal Discovery of Ancestral Graphs

Tiago da Silva, Bruna Bazaluk, Eliezer de Souza da Silva, António Góis, Dominik Heider, Samuel Kaski, Diego Mesquita, Adèle Helena Ribeiro

TL;DR

Causal discovery (CD) under latent confounding is hampered by limited data and by a lack of uncertainty quantification. The authors propose Ancestral GFlowNets (AGFN), a probabilistic method that samples ancestral graphs from a score-based belief $p(\mathcal{G}) \propto R(\mathcal{G})$, with $R(\mathcal{G})=\exp\{(U(\mathcal{G})-\mu)/\sigma\}$, and that integrates an active, uncertain expert-in-the-loop (EITL) to refine the inference without retraining. The framework combines Bayesian-style updating of edge-ancestry beliefs with a learned forward policy, and uses a cross-entropy-based acquisition function to choose which queries to pose to experts (including GPT-4o), improving SHD and BIC on synthetic data and on the Sachs dataset. The resulting uncertainty-aware CD remains competitive with state-of-the-art methods while incorporating imperfect expert knowledge, which matters for real-world causal analysis under hidden confounding.
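
To make the belief $p(\mathcal{G}) \propto R(\mathcal{G})$ concrete, the sketch below normalizes the reward $R(\mathcal{G})=\exp\{(U(\mathcal{G})-\mu)/\sigma\}$ over a tiny, hand-enumerated set of candidate graphs. The graph labels, the score values, and the function name `belief_from_scores` are hypothetical and chosen only for illustration. AGFN itself does not enumerate graphs; it learns a policy that samples them, so this is only a picture of the target distribution.

```python
import math

def belief_from_scores(scores, mu, sigma):
    """Normalize R(G) = exp((U(G) - mu) / sigma) over an enumerable set of graphs.

    scores: dict mapping a graph label to its score U(G) (hypothetical values).
    """
    logits = {g: (u - mu) / sigma for g, u in scores.items()}
    # Subtract the largest logit before exponentiating for numerical stability.
    top = max(logits.values())
    weights = {g: math.exp(l - top) for g, l in logits.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Hypothetical scores for three candidate graphs over variables 1, 2, 3.
scores = {"1->2<->3": -10.0, "1->2->3": -12.0, "1<->2<->3": -15.0}
belief = belief_from_scores(scores, mu=-12.0, sigma=2.0)

for graph, prob in sorted(belief.items(), key=lambda kv: -kv[1]):
    print(f"{graph}: {prob:.3f}")
```

In this sketch the shift $\mu$ cancels in the normalization, while $\sigma$ acts like a temperature: a larger value flattens the belief and a smaller one concentrates it on the highest-scoring graph.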

Abstract

Causal discovery (CD) algorithms are notably brittle when data is scarce, inferring unreliable causal relations that may contradict expert knowledge, especially when considering latent confounders. Furthermore, the lack of uncertainty quantification in most CD methods hinders users from diagnosing and refining results. To address these issues, we introduce Ancestral GFlowNets (AGFNs). AGFN samples ancestral graphs (AGs) proportionally to a score-based belief distribution representing our epistemic uncertainty over the causal relationships. Building upon this distribution, we propose an elicitation framework for expert-driven assessment. This framework comprises an optimal experimental design to probe the expert and a scheme to incorporate the obtained feedback into AGFN. Our experiments show that: i) AGFN is competitive against other methods that address latent confounding on both synthetic and real-world datasets; and ii) our design for incorporating feedback from a (simulated) human expert or a Large Language Model (LLM) improves inference quality.
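
The query-then-update loop described in the abstract can be pictured with a small sketch. It is not the paper's scheme: as stand-ins, it picks the variable pair whose relation marginal has the highest Shannon entropy (the paper's acquisition function is based on cross-entropy) and reweights sampled graphs by the likelihood of a noisy answer under an assumed uniform-error expert model. All names, the `reliability` parameter, and the sample graphs are hypothetical.

```python
import math
from collections import defaultdict

RELATIONS = ("none", "->", "<-", "<->")  # assumed answer vocabulary

def relation_marginal(samples, weights, pair):
    """Marginal belief over the relation of `pair` induced by weighted samples."""
    marginal = defaultdict(float)
    for graph, w in zip(samples, weights):
        marginal[graph.get(pair, "none")] += w
    return dict(marginal)

def entropy(marginal):
    return -sum(p * math.log(p) for p in marginal.values() if p > 0)

def most_uncertain_pair(samples, weights, pairs):
    """Stand-in acquisition rule: query the pair with the highest entropy."""
    return max(pairs, key=lambda p: entropy(relation_marginal(samples, weights, p)))

def update_belief(samples, weights, pair, answer, reliability):
    """Reweight samples by the likelihood of a noisy expert answer.

    Assumes the expert reports the true relation with probability
    `reliability` and otherwise errs uniformly over the other relations.
    """
    wrong = (1.0 - reliability) / (len(RELATIONS) - 1)
    new = [w * (reliability if g.get(pair, "none") == answer else wrong)
           for g, w in zip(samples, weights)]
    total = sum(new)
    return [w / total for w in new]

# Three hypothetical sampled graphs, stored as {pair: relation}.
samples = [
    {(1, 2): "->", (2, 3): "<->"},
    {(1, 2): "->", (2, 3): "->"},
    {(1, 2): "->", (2, 3): "<->", (1, 3): "->"},
]
weights = [0.4, 0.4, 0.2]
pairs = [(1, 2), (1, 3), (2, 3)]

query = most_uncertain_pair(samples, weights, pairs)
posterior = update_belief(samples, weights, query, answer="<->", reliability=0.9)
```

Because only the sample weights change, the update needs no retraining of the sampler, which is the property the summary attributes to AGFN's feedback scheme.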

Paper Structure (58 sections, 32 equations, 12 figures, 4 tables, 1 algorithm)


Figures (12)

  • Figure 1: EITL probabilistic CD. We iteratively refine the trained AGFN by i) querying (Q) experts on the relationship of a highly informative pair of variables and ii) updating beliefs based on their (noisy) answers (A). Histograms show marginals over edge types (green denotes ground truth). Notably, our belief increasingly concentrates on the true AG, $1 \rightarrow 2 \leftrightarrow 3$.
  • Figure 2: Generative process of AGs $\{\mathcal{G}_5, \mathcal{G}_6, \mathcal{G}_7\}$ using GFlowNets. Starting from an empty graph $\mathcal{G}_0$, edges between variables $\{X_1, X_2, X_3\}$ are added following the action-policy $\pi_F$. Solid edges indicate trajectories leading to sampled graphs. Dashed lines denote non-realized transitions to the terminal state $\square$.
  • Figure 3: Sampling quality. AGFN accurately samples from its underlying score-based beliefs. The 1st plot (left) shows that the marginal probabilities over edges induced by AGFN match the analytical marginals. The 2nd shows the same for the probability of directed paths between two variables. The rightmost plots show that the distributions of the SHD and BIC from AGFN samples closely match the analytical ones.
  • Figure 4: Human-aided AGFN outperforms CD baselines after a single round of feedback in the considered datasets. Each plot shows the expected SHD for a varying number of feedback rounds, over $30$ EITL simulations.
  • Figure 5: GPT-aided AGFN outperforms CD baselines in the considered datasets. Each plot shows the expected SHD for a varying number of feedback rounds, over $3$ EITL simulations.
  • ...and 7 more figures
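
Several captions above report the expected SHD under the current belief. As a rough illustration, the sketch below computes a simplified structural Hamming distance that counts variable pairs whose edge type differs, then averages it over weighted samples. The graph encoding, the function names, and the example graphs are assumptions made for this sketch; SHD conventions vary (for example, in how a reversed edge is counted), and the paper's exact definition may differ.

```python
from itertools import combinations

def shd(graph, truth, nodes):
    """Count pairs of variables whose edge type differs between two graphs.

    Graphs are dicts {(i, j): relation} with i < j; missing pairs mean no edge.
    `nodes` must be sorted so that combinations yields keys with i < j.
    """
    return sum(graph.get(p, "none") != truth.get(p, "none")
               for p in combinations(nodes, 2))

def expected_shd(samples, weights, truth, nodes):
    """Average SHD to the ground truth under a weighted set of sampled graphs."""
    return sum(w * shd(g, truth, nodes) for g, w in zip(samples, weights))

nodes = (1, 2, 3)
truth = {(1, 2): "->", (2, 3): "<->"}  # the AG 1 -> 2 <-> 3 from Figure 1

# Hypothetical samples and belief weights.
samples = [
    {(1, 2): "->", (2, 3): "<->"},  # identical to the truth
    {(1, 2): "->", (2, 3): "->"},   # one edge type differs
    {(1, 3): "->"},                 # all three pairs differ
]
weights = [0.5, 0.3, 0.2]

print(expected_shd(samples, weights, truth, nodes))
```

As the belief concentrates on the true graph, the weight on the first sample grows and this average falls toward zero, which is the trend the feedback plots track.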