DoWhy-GCM: An extension of DoWhy for causal inference in graphical causal models

Patrick Blöbaum, Peter Götz, Kailash Budhathoki, Atalanti A. Mastakouri, Dominik Janzing

TL;DR

DoWhy-GCM extends the DoWhy library by representing causal systems as graphical causal models (GCMs), which supports a broader set of causal queries than effect estimation alone. It follows a three-step recipe: define a directed acyclic graph (DAG) and node-level causal mechanisms, fit those mechanisms from observational data (including automatic inference based on additive noise models, ANMs), and query the trained GCM to obtain counterfactuals, attributions, and structure-diagnosis results. The work introduces a modular, functional API that interoperates with common libraries (e.g., NetworkX, NumPy, Pandas) and supports third-party implementations, while providing native algorithms for core tasks such as falsification, edge-strength assessment, and anomaly/distribution-change attribution. Overall, DoWhy-GCM broadens causal analysis for complex systems, covering root-cause tracing, distributional diagnostics, and counterfactual reasoning in a framework designed for extensibility and real-world integration.
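The three-step recipe can be illustrated with a minimal NumPy-only sketch. It does not use the DoWhy-GCM API and is not the library's implementation. The chain X -> Y -> Z, the synthetic data, the zero-mean linear mechanisms, and the helper names `fit_mechanisms`, `predict`, and `counterfactual` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: define the DAG X -> Y -> Z as a node -> parents mapping.
# Nodes are listed in topological order (hypothetical example graph).
parents = {"X": [], "Y": ["X"], "Z": ["Y"]}

# Synthetic observational data from a known linear additive noise model.
n = 5000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.5 * y + rng.normal(size=n)
data = {"X": x, "Y": y, "Z": z}


def fit_mechanisms(parents, data):
    """Step 2: fit one linear mechanism per non-root node by least squares.

    No intercept is fitted because the synthetic data is zero-mean.
    """
    coefs = {}
    for node, pa in parents.items():
        if pa:
            design = np.column_stack([data[p] for p in pa])
            coefs[node], *_ = np.linalg.lstsq(design, data[node], rcond=None)
    return coefs


def predict(node, values, coefs, parents):
    """Evaluate the fitted mechanism of `node` on its parents' values."""
    return sum(c * values[p] for c, p in zip(coefs[node], parents[node]))


def counterfactual(observed, intervention, coefs, parents):
    """Step 3: answer a counterfactual query for one observed sample."""
    # Abduction: recover each node's noise term from the observation.
    noise = {}
    for node, pa in parents.items():
        fitted = predict(node, observed, coefs, parents) if pa else 0.0
        noise[node] = observed[node] - fitted
    # Action and prediction: fix the intervened nodes, then propagate
    # the recovered noise through the graph in topological order.
    result = {}
    for node, pa in parents.items():
        if node in intervention:
            result[node] = intervention[node]
        elif pa:
            result[node] = predict(node, result, coefs, parents) + noise[node]
        else:
            result[node] = noise[node]
    return result


coefs = fit_mechanisms(parents, data)
observed = {"X": 1.0, "Y": 2.5, "Z": -3.0}
# "What would Z have been for this sample had Y been 0?"
cf = counterfactual(observed, {"Y": 0.0}, coefs, parents)
print(coefs, cf)
```

Because the noise terms are recovered from the observed sample and reused, the answer is specific to that sample. An interventional query would instead draw fresh noise for every node.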

Abstract

We present DoWhy-GCM, an extension of the DoWhy Python library, which leverages graphical causal models. Unlike existing causality libraries, which mainly focus on effect estimation, DoWhy-GCM addresses diverse causal queries, such as identifying the root causes of outliers and distributional changes, attributing causal influences to the data-generating process of each node, or diagnosing causal structures. With DoWhy-GCM, users typically specify cause-effect relations via a causal graph, fit causal mechanisms, and pose causal queries -- all with just a few lines of code. The general documentation is available at https://www.pywhy.org/dowhy and the DoWhy-GCM-specific code at https://github.com/py-why/dowhy/tree/main/dowhy/gcm.
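To make the idea of tracing an outlier to a root cause concrete, the following is a deliberately simplified sketch. It recovers each node's noise term for an anomalous sample and flags the node whose noise is most extreme. This is an illustrative stand-in under assumed linear mechanisms, not the attribution algorithm that DoWhy-GCM implements. The chain X -> Y -> Z, the injected shock, and the helper names `fit_slope` and `noise_scores` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training data for the assumed chain X -> Y -> Z.
n = 5000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.5 * y + rng.normal(size=n)


def fit_slope(parent, child):
    """Least-squares slope of a zero-mean, single-parent linear mechanism."""
    return float(parent @ child / (parent @ parent))


slope_y = fit_slope(x, y)
slope_z = fit_slope(y, z)

# Typical spread of each node's noise term, estimated from residuals.
residuals = {"X": x, "Y": y - slope_y * x, "Z": z - slope_z * y}
scale = {node: float(np.std(r)) for node, r in residuals.items()}


def noise_scores(sample):
    """Score each node by how extreme its recovered noise term is."""
    noise = {
        "X": sample["X"],
        "Y": sample["Y"] - slope_y * sample["X"],
        "Z": sample["Z"] - slope_z * sample["Y"],
    }
    return {node: abs(value) / scale[node] for node, value in noise.items()}


# An anomalous sample: Y's own mechanism receives a large shock, which
# propagates downstream and makes Z look unusual as well.
anomaly_y = 2.0 * 0.3 + 8.0
anomaly = {"X": 0.3, "Y": anomaly_y, "Z": -1.5 * anomaly_y + 0.2}

scores = noise_scores(anomaly)
root_cause = max(scores, key=scores.get)
print(scores, root_cause)
```

Although Z takes an unusual value here, its own noise term is ordinary, so the sketch points to Y. Separating a node's own contribution from what it inherits from upstream nodes is the distinction that graph-based root-cause analysis relies on.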

Paper Structure (5 sections, 1 figure)

This paper contains 5 sections, 1 figure.

Figures (1)

  • Figure 1: DoWhy-GCM complements DoWhy by allowing users to address a wide range of causal questions with graphical causal models. The graph structure is a data type shared across most features, which ensures interoperability.