Towards Automated Causal Discovery: a case study on 5G telecommunication data

Konstantina Biza, Antonios Ntroumpogiannis, Sofia Triantafillou, Ioannis Tsamardinos

TL;DR

AutoCD introduces a fully automated causal discovery and reasoning framework composed of Automated Feature Selection, Causal Learning, and Causal Reasoning and Visualization to handle high-dimensional, mixed-type temporal data. It optimizes data representation and algorithm hyperparameters via CASH and OCT, estimates the Markov boundary $\mathbf{Mb}$ to build causal graphs, and uses bootstrapping to provide edge-confidence measures and intuitive Cytoscape visualizations. The authors demonstrate AutoCD on a real 5G telecommunication dataset and through resimulated and synthetic experiments, showing favorable precision, robust edge-confidence estimation, and interpretable outputs for non-expert users. This work offers a practical, scalable platform for automated causal discovery with broad applicability to complex networked systems and time-series data.

Abstract

We introduce the concept of Automated Causal Discovery (AutoCD), defined as any system that aims to fully automate the application of causal discovery and causal reasoning methods. AutoCD's goal is to deliver all the causal information that an expert human analyst would provide and to answer a user's causal queries. We describe the architecture of such a platform and illustrate its performance on synthetic data sets. As a case study, we apply it to temporal telecommunication data. The system is general and can be applied to a plethora of causal discovery problems.

Paper Structure

This paper contains 25 sections, 8 figures, 1 table.

Figures (8)

  • Figure 1: The proposed architecture for Automated Causal Discovery.
  • Figure 2: The estimated causal structure for the telecommunication problem using the AFS, CL and CRV modules of AutoCD.
  • Figure 3: The precision and recall of the estimated Markov boundary and the difference in predictive performance on resimulated data over increasing sample size.
  • Figure 4: The edge adjacency precision and recall on the selected estimated graph and the difference in SHD on resimulated data over increasing sample size.
  • Figure 5: The AUC of the edge consistency frequency on resimulated data over increasing sample size.
  • ...and 3 more figures