Robust Causal Analysis of Linear Cyclic Systems With Hidden Confounders

Boris Lorbeer, Axel Küpper

TL;DR

This work targets causal analysis in linear systems with cycles and hidden confounders, where data contamination can distort causal inferences. It first analyzes the robustness of the LLC algorithm and then develops robust LLC variants by replacing the standard covariance estimation step with one of two robust covariance estimators: Minimum Covariance Determinant (MCD) or Gamma Divergence Estimation (GDE). Theoretical results show that the original LLC has a breakdown point of zero, and empirical tests on synthetic data reveal that the GDE-based LLC provides the strongest robustness gains, often surpassing MCD under moderate contamination. The study also supports reproducibility by providing open-source code, which should ease adoption in domains that require cyclic, confounded causal modeling.
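To make the breakdown-point claim concrete, the following minimal sketch shows how a single contaminated observation can push a non-robust covariance estimate arbitrarily far from the truth. It assumes that the standard covariance estimate is the ordinary sample covariance matrix, and it illustrates only that estimator, not LLC itself. The helper `sample_cov_error`, the 2x2 covariance `true_cov`, the sample size, and the outlier magnitudes are hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np


def sample_cov_error(outlier_magnitude, n=500, seed=0):
    """Frobenius error of the sample covariance after replacing one point.

    Hypothetical helper: draws n Gaussian samples, overwrites the first
    sample with (outlier_magnitude, outlier_magnitude), and compares the
    sample covariance with the true covariance.
    """
    rng = np.random.default_rng(seed)
    true_cov = np.array([[1.0, 0.5], [0.5, 2.0]])  # assumed ground truth
    x = rng.multivariate_normal(np.zeros(2), true_cov, size=n)
    x[0] = outlier_magnitude  # contaminate exactly one observation
    est = np.cov(x, rowvar=False)  # ordinary (non-robust) sample covariance
    return np.linalg.norm(est - true_cov, ord="fro")


# The same clean data, with one point moved further and further away.
magnitudes = (0.0, 1e2, 1e4, 1e6)
errors = [sample_cov_error(m) for m in magnitudes]

for m, err in zip(magnitudes, errors):
    print(f"outlier magnitude {m:>9.0e}: Frobenius error {err:.3e}")
```

The error grows roughly with the square of the outlier magnitude divided by the sample size, so it is unbounded even though only 1 of 500 points is contaminated. This is the behavior a breakdown point of zero describes, and it is the reason for swapping in a robust covariance estimator.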

Abstract

We live in a world full of complex systems that we need to understand better. To accomplish this, purely probabilistic investigations are often not enough. They are only the first step and must be followed by learning the system's underlying mechanisms. This is what the discipline of causality is concerned with. Many of those complex systems contain feedback loops, which means that our methods have to allow for cyclic causal relations. Furthermore, systems are rarely sufficiently isolated, which means that there are usually hidden confounders, i.e., unmeasured variables that each causally affect more than one measured variable. Finally, data is often distorted by contaminating processes, and we need to apply methods that are robust against such distortions. That is why we consider the robustness of LLC, see \cite{llc}, one of the few causal analysis methods that can deal with cyclic models with hidden confounders. Following a theoretical analysis of LLC's robustness properties, we also provide robust extensions of LLC. To facilitate reproducibility and further research in this field, we make the source code publicly available.

Paper Structure

This paper contains 14 sections, 7 theorems, 28 equations, 2 figures, 3 tables.

Key Result

Lemma 1

The matrix $\mathbf{T}^k$, in general, depends on the distribution of the random vector $\mathbf{c}$.

Figures (2)

  • Figure 1: Logarithmic RFE for $\mathbf{B}$ as a function of the contamination rate for the default implementation (SCM) and the two robustified versions (MCD, GDE)
  • Figure 2: Logarithmic RFE for $\bm{\Sigma}_\mathbf{e}$ as a function of the contamination rate for the default implementation (SCM) and the two robustified versions (MCD, GDE)

Theorems & Definitions (12)

  • Lemma 1
  • proof
  • Proposition 1
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Proposition 2
  • ...and 2 more