Inference of Causal Networks using a Topological Threshold
Filipe Barroso, Diogo Gomes, Gareth J. Baxter
TL;DR
The paper addresses causal-network inference from data, replacing ad hoc thresholds with automatic topological criteria. It introduces Net Influence (NI), a state-wise, asymmetric measure, and a two-stage constraint-based algorithm that first determines a data-driven threshold by a topological method (Connected or Knee) and then prunes edges using conditional independence (CI) tests. Across real and synthetic networks, the NI-based Knee method consistently achieves fast inference with accurate edge directionality, often surpassing the PC benchmark in both speed and accuracy, especially for larger networks. The work offers a scalable approach to DAG discovery from discrete data, with principled thresholding and edge orientation, suited to applications in engineering and science where large datasets and causal insight are essential.
Abstract
We propose a constraint-based algorithm for inferring causal networks from data that automatically determines the thresholds of causal relevance. We call these topological thresholds. We present two methods for determining the threshold: the first seeks a set of edges that leaves no disconnected nodes in the network; the second seeks a large causally connected component in the data. We tested these methods on both discrete synthetic and real data, and compared the results with those obtained with the PC algorithm, which we took as the benchmark. We show that the new algorithm is generally faster and more accurate than the PC algorithm. Determining the thresholds requires choosing a measure of causality. We tested our methods with Fisher correlations, commonly used in the PC algorithm (for instance in \cite{kalisch2005}), and further propose a discrete, asymmetric measure of causality, which we call Net Influence, that gave very good results when inferring causal networks from discrete data. Because this measure is asymmetric, edge directionality can be inferred while the thresholds are applied, which speeds up the inference of causal DAGs.
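To make the first thresholding method concrete, the following is a minimal sketch of one possible reading of it, not the authors' implementation. It assumes a square matrix of pairwise causality scores (the names `scores`, `connected_threshold`, and `edges_above` are hypothetical) and takes the threshold to be the largest value at which every node still keeps at least one incident edge, in either direction.

```python
import numpy as np


def connected_threshold(scores):
    """Largest threshold that leaves no node without an incident edge.

    Assumption: scores[i][j] is the causality score of the edge i -> j,
    and an edge is kept when its score is >= the threshold.
    """
    s = np.asarray(scores, dtype=float)
    # Strongest score between each pair, regardless of direction.
    strongest = np.maximum(s, s.T)
    # Self-loops must not count as incident edges.
    np.fill_diagonal(strongest, -np.inf)
    # Each node's best incident edge; the weakest of these sets the threshold.
    return strongest.max(axis=1).min()


def edges_above(scores, threshold):
    """Directed edges (i, j) whose score reaches the threshold."""
    s = np.asarray(scores, dtype=float)
    n = s.shape[0]
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and s[i, j] >= threshold]


# Hypothetical asymmetric score matrix for three variables.
scores = [[0.0, 0.9, 0.1],
          [0.2, 0.0, 0.3],
          [0.1, 0.4, 0.0]]
threshold = connected_threshold(scores)
kept = edges_above(scores, threshold)
```

With an asymmetric measure, the kept pairs are already ordered, which illustrates how directionality can come out of the thresholding step itself. The pruning by conditional independence tests described in the abstract is not part of this sketch.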
