CausalMan: A physics-based simulator for large-scale causality

Nicholas Tagliapietra, Juergen Luettin, Lavdim Halilaj, Moritz Willig, Tim Pychynski, Kristian Kersting

TL;DR

This paper introduces CausalMan, a physics-based simulator for large-scale causality benchmarking in manufacturing that provides ground-truth interventional data and controllable heterogeneity. Using two large structural causal models (SCMs) derived from real-world domain knowledge, together with the simulator's interventional capabilities, the authors benchmark a spectrum of causal inference and causal discovery methods, revealing substantial gaps in accuracy and scalability on nonlinear, high-dimensional, mixed-type data-generating processes (DGPs). The work provides an experimental framework, a dataset suite, and detailed analyses of model performance and computational trade-offs, highlighting the need for new methodologies and human-in-the-loop strategies. By releasing the simulator and datasets, the authors aim to catalyze progress in large-scale causal reasoning and its practical deployment in industry settings.

Abstract

A comprehensive understanding of causality is critical for navigating and operating within today's complex real-world systems. The absence of realistic causal models with known data generating processes complicates fair benchmarking. In this paper, we present the CausalMan simulator, modeled after a real-world production line. The simulator features a diverse range of linear and non-linear mechanisms and challenging-to-predict behaviors, such as discrete mode changes. We demonstrate the inadequacy of many state-of-the-art approaches and analyze the significant differences in their performance and tractability, both in terms of runtime and memory complexity. As a contribution, we will release the CausalMan large-scale simulator. We present two derived datasets, and perform an extensive evaluation of both.

Paper Structure

This paper contains 48 sections, 28 equations, 15 figures, 10 tables.

Figures (15)

  • Figure 1: Complete ground-truth causal graph, including hidden variables, for CausalMan Medium. Observable variables are colored orange and latent ones blue. 419 of 605 variables (69.2%) are latent.
  • Figure 2: CausalMan Small. ATE MSE vs. dataset size on the second interventional task. On nontrivial tasks with a large number of nonlinearities and confounders, linear regression is at a clear disadvantage.
  • Figure 3: Time to discover a causal graph with $n = 10{,}000$ samples. Methods that thrive on CausalMan Small may be computationally impractical on CausalMan Medium.
  • Figure 4: CausalMan Small. The panel on ATE MSE vs. dataset size shows that effect-estimation performance stagnates even when more data is used. The Jensen-Shannon comparison panel instead illustrates the accuracy, measured by Jensen-Shannon divergence (JS-Div.), of the treated and control distributions for learning-based causal models after training with $n = 50{,}000$ samples.
  • Figure 5: Example of a conditional dependency where A (categorical) determines the distribution of B. Node distributions are often not fixed a priori; their parameters are determined by the values of several categorical (parent) variables. The resulting marginal distribution can be asymmetric and multimodal.
  • ...and 10 more figures
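
The conditional dependency described in the Figure 5 caption can be illustrated with a minimal sketch. This is not the CausalMan implementation: the mode names, probabilities, and Gaussian parameters below are hypothetical values chosen only to show how a categorical parent A that selects the parameters of B yields an asymmetric, multimodal marginal, and how an intervention do(A = a) removes the mixing.

```python
import random
import statistics

# Hypothetical values for illustration only (not taken from the paper):
# each category of A selects the (mean, std) of a Gaussian for B.
MODE_PARAMS = {"idle": (0.0, 0.5), "normal": (5.0, 1.0), "overload": (12.0, 3.0)}
MODE_PROBS = {"idle": 0.6, "normal": 0.3, "overload": 0.1}


def sample_pair(rng):
    """Draw (A, B) observationally: A is categorical, B depends on A."""
    a = rng.choices(list(MODE_PROBS), weights=list(MODE_PROBS.values()))[0]
    mean, std = MODE_PARAMS[a]
    return a, rng.gauss(mean, std)


def sample_b_do_a(rng, a):
    """Draw B under the intervention do(A = a): A is fixed, not sampled."""
    mean, std = MODE_PARAMS[a]
    return rng.gauss(mean, std)


rng = random.Random(0)

# Observational marginal of B: a mixture of three Gaussians.
b_values = [b for _, b in (sample_pair(rng) for _ in range(20000))]
marginal_mean = statistics.fmean(b_values)
marginal_median = statistics.median(b_values)

# Interventional distributions of B for two settings of A.
b_overload = [sample_b_do_a(rng, "overload") for _ in range(20000)]
b_idle = [sample_b_do_a(rng, "idle") for _ in range(20000)]
effect = statistics.fmean(b_overload) - statistics.fmean(b_idle)

print(f"marginal mean={marginal_mean:.2f}, median={marginal_median:.2f}")
print(f"mean difference between do(A=overload) and do(A=idle): {effect:.2f}")
```

In this sketch the marginal mean of B sits well above its median because most of the probability mass lies in the low "idle" mode while the rarer modes stretch the right tail. A single linear model fitted to the marginal would capture none of the three modes well, which is one intuition for why such mode changes are hard to predict.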

Theorems & Definitions (1)

  • Definition 3.1