A Survey on Causal Discovery Methods for I.I.D. and Time Series Data

Uzma Hasan, Emam Hossain, Md Osman Gani

TL;DR

This survey reviews methods for causal discovery from both independent and identically distributed (I.I.D.) data and time series data, covering the algorithms designed to identify causal relations in different settings.

Abstract

The ability to understand causality from data is one of the major milestones of human-level intelligence. Causal Discovery (CD) algorithms can identify the cause-effect relationships among the variables of a system from related observational data under certain assumptions. Over the years, several methods have been developed, primarily based on the statistical properties of data, to uncover the underlying causal mechanism. In this study, we present an extensive discussion of the methods designed to perform causal discovery from both independent and identically distributed (I.I.D.) data and time series data. For this purpose, we first introduce the common terminology used in the causal discovery literature and then provide a comprehensive discussion of the algorithms designed to identify causal relations in different settings. We further discuss some of the benchmark datasets available for evaluating algorithmic performance, off-the-shelf tools and software packages for performing causal discovery readily, and the common metrics used to evaluate these methods. We also evaluate some widely used causal discovery algorithms on multiple benchmark datasets and compare their performance. Finally, we conclude by discussing the research challenges and the applications of causal discovery algorithms in multiple areas of interest.

Paper Structure

This paper contains 95 sections, 20 equations, 33 figures, and 8 tables.

Figures (33)

  • Figure 1: Causal Discovery: Identification of a causal graph from data.
  • Figure 2: (a) Latent confounder $L$ causes both variables $S$ and $C$; the association between $S$ and $C$, denoted by $\textbf{?}$, can be mistaken for causation. (b) A causal graph depicting the causes and effects of cancer (CANCER).
  • Figure 3: (a) A graph $G$, (b) its skeleton graph $S_{G}$, (c) a mixed graph $M_{G}$ with directed and undirected edges.
  • Figure 4: Fundamental building blocks in causal graphical models.
  • Figure 5: Markov Equivalence in Chains and Forks.
  • ...and 28 more figures

Theorems & Definitions (12)

  • Definition 1: Chain
  • Definition 2: Fork
  • Definition 3: Collider/V-structure
  • Definition 4: d-separation
  • Definition 5: Markov Blanket
  • Definition 6: Structural Causal Model
  • Definition 7: Time Series Data
  • Definition 8: Instantaneous Causal Effect
  • Definition 9: Lagged Causal Effect
  • Definition 10: Full-time Causal Graph
  • ...and 2 more