On the Recoverability of Causal Relations from Temporally Aggregated I.I.D. Data

Shunxing Fan, Mingming Gong, Kun Zhang

TL;DR

The paper shows theoretically and experimentally that aggregation can seriously distort causal discovery results, especially in the completely nonlinear case, and finds that causal relations remain recoverable from aggregated data given partial linearity or an appropriate prior.

Abstract

We consider the effect of temporal aggregation on instantaneous (non-temporal) causal discovery in a general setting. This is motivated by the observation that the true causal time lag is often considerably shorter than the observational interval. This discrepancy leads to high aggregation, causing time-delay causality to vanish and instantaneous dependence to manifest. Although we expect such instantaneous dependence to be consistent with the true causal relation in a certain sense, so that the discovery results are meaningful, it remains unclear what type of consistency we need and when such consistency is satisfied. We propose formal notions of functional consistency and conditional independence consistency, corresponding to functional causal model-based methods and conditional independence-based methods respectively, and provide the conditions under which these consistencies hold. We show theoretically and experimentally that causal discovery results may be seriously distorted by aggregation, especially in the completely nonlinear case, and we also find that the causal relationship is still recoverable from aggregated data if we have partial linearity or an appropriate prior. Our findings suggest that the community should take a cautious and meticulous approach when interpreting causal discovery results from such data, and they show why and when aggregation will distort the performance of causal discovery methods.
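The claim that aggregation makes time-delay causality vanish and instantaneous dependence appear can be illustrated with a small simulation. The sketch below is not the paper's experimental setup: it assumes a simplified linear model in which $Y_t$ depends only on $X_{t-1}$, with i.i.d. Gaussian $X$, and averages over windows of length $k$. The function name `simulate_aggregated` and the parameters `coef` and `k` are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_aggregated(n_samples=5000, k=20, coef=0.8, seed=0):
    """Hypothetical helper: simulate X_{t-1} -> Y_t and average over k steps.

    Assumed model (illustrative only): X_t is i.i.d. standard normal and
    Y_t = coef * X_{t-1} + noise. Each row is one independent window.
    Returns (same-time correlation before aggregation,
             correlation between the window averages).
    """
    rng = np.random.default_rng(seed)
    # k + 1 fine-grained time steps of X per window, so X_{t-1} exists for each Y_t
    x = rng.normal(size=(n_samples, k + 1))
    noise = rng.normal(scale=0.5, size=(n_samples, k))
    # Y at step t depends only on X at step t - 1 (pure time-delayed effect)
    y = coef * x[:, :-1] + noise
    x_aligned = x[:, 1:]  # X_t aligned with Y_t
    # Same-time dependence at the fine-grained scale: X_t vs Y_t
    fine_corr = np.corrcoef(x_aligned[:, 0], y[:, 0])[0, 1]
    # Dependence between the temporally aggregated (averaged) variables
    agg_corr = np.corrcoef(x_aligned.mean(axis=1), y.mean(axis=1))[0, 1]
    return fine_corr, agg_corr


if __name__ == "__main__":
    fine_corr, agg_corr = simulate_aggregated()
    print(f"fine-grained same-time correlation: {fine_corr:.3f}")
    print(f"aggregated correlation:             {agg_corr:.3f}")
```

Under these assumptions, the fine-grained same-time correlation is close to zero because the effect is purely lagged, while the averaged series are clearly correlated because the two windows share all but one of the underlying $X$ values. Whether that instantaneous dependence is consistent with the true causal relation is the question the paper's consistency notions address; this linear sketch says nothing about the nonlinear case.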

Paper Structure

This paper contains 27 sections, 13 theorems, 51 equations, 11 figures, 3 tables.

Key Result

Theorem 3.3

If such $\hat{f}$, as defined in Definition 3.2 (Functional Consistency Regarding Additive Noise), exists, then $\hat{f}$ must take the form: where $c$ is any constant (which can be incorporated into the noise term) and the expression $\mathbb{E}(\cdot \mid \overline{X}=T)$ denotes the conditional expectation. For simplicity, we set $c=0$. Consequently, this implies:

Figures (11)

  • Figure 1: Left: Directed acyclic graph for the VAR model with chain-like cross lag effects. Right: The corresponding summary graph.
  • Figure 2: Left: Chain-like aligned model. Center: Fork-like aligned model. Right: Collider-like aligned model.
  • Figure 3: Relationship between the aggregation factor $k$ and the performance of the Direct LiNGAM method. The blue area represents the standard deviation. The red line represents the random guess baseline.
  • Figure 4: Causal graph of original data
  • Figure 5: Linear Case: Accuracy of Causal Discovery Method
  • ...and 6 more figures

Theorems & Definitions (27)

  • Definition 2.1: Summary Graph
  • Definition 3.1: Bivariate Aligned Model with Instant Structures
  • Definition 3.2: Functional Consistency Regarding Additive Noise
  • Theorem 3.3: Construction of $\hat{f}$
  • Theorem 3.4: Necessary and Sufficient Condition
  • Definition 3.5: Functional Consistency with respect to different regions
  • Theorem 3.6: General Case for Functional Consistency Across Different Regions
  • Definition 4.1: Trivariate Aligned Model with Instant Structures
  • Definition 4.2: Conditional Independence Consistency
  • Remark 4.3: Conditional Independence Consistency under Faithfulness Condition
  • ...and 17 more