CausalMan: A physics-based simulator for large-scale causality
Nicholas Tagliapietra, Juergen Luettin, Lavdim Halilaj, Moritz Willig, Tim Pychynski, Kristian Kersting
TL;DR
This paper introduces CausalMan, a physics-based simulator for benchmarking large-scale causality in manufacturing. It provides ground-truth interventional data and controllable heterogeneity. Using two large structural causal models (SCMs) derived from real-world domain knowledge, together with the simulator's interventional capabilities, the authors benchmark a range of causal inference and causal discovery methods. The results reveal substantial gaps in accuracy and scalability on nonlinear, high-dimensional, mixed-type data-generating processes (DGPs). The work contributes an experimental framework, a dataset suite, and detailed analyses of model performance and computational trade-offs, and it highlights the need for new methodologies and human-in-the-loop strategies. By releasing the simulator and datasets, the authors aim to catalyze progress in large-scale causal reasoning and its practical deployment in industrial settings.
Abstract
A comprehensive understanding of causality is critical for navigating and operating within today's complex real-world systems. The absence of realistic causal models with known data-generating processes complicates fair benchmarking. In this paper, we present the CausalMan simulator, modeled after a real-world production line. The simulator features a diverse range of linear and non-linear mechanisms and challenging-to-predict behaviors, such as discrete mode changes. We demonstrate the inadequacy of many state-of-the-art approaches and analyze the significant differences in their performance and tractability, in terms of both runtime and memory complexity. As a contribution, we will release the CausalMan large-scale simulator. We also present two derived datasets and perform an extensive evaluation on both.
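To make the terms in the abstract concrete, the following is a minimal toy sketch of a structural causal model that mixes a linear mechanism, a non-linear mechanism, and a discrete mode change, and that supports hard interventions. It is not taken from CausalMan: the variables (`temperature`, `force`, `mode`, `quality`), the coefficients, the threshold, and the function name `sample_unit` are all hypothetical and chosen only for illustration.

```python
import math
import random


def sample_unit(rng, do=None):
    """Draw one sample from a toy SCM (hypothetical, not the CausalMan model).

    `do` maps variable names to fixed values and acts as a hard intervention:
    an intervened variable ignores its parents and its noise term.
    """
    do = do or {}

    # Exogenous noise is always drawn, so the same seed yields the same noise
    # whether or not an intervention is applied.
    n_t = rng.gauss(0.0, 2.0)
    n_f = rng.gauss(0.0, 1.0)
    n_q = rng.gauss(0.0, 0.1)

    # Root variable.
    temperature = do.get("temperature", 20.0 + n_t)
    # Linear mechanism.
    force = do.get("force", 2.0 * temperature + n_f)
    # Discrete mode change: the system switches regime above a threshold.
    mode = do.get("mode", 1 if force > 40.0 else 0)
    # Non-linear mechanism that also depends on the discrete mode.
    quality = do.get("quality", math.sin(force / 10.0) + 0.5 * mode + n_q)

    return {
        "temperature": temperature,
        "force": force,
        "mode": mode,
        "quality": quality,
    }


if __name__ == "__main__":
    observational = sample_unit(random.Random(0))
    interventional = sample_unit(random.Random(0), do={"force": 50.0})
    print(observational)
    print(interventional)
```

Because the data-generating process is known, samples drawn under `do={"force": 50.0}` give ground-truth interventional data for this toy model; variables upstream of the intervention (here `temperature`) are unaffected, while downstream variables (`mode`, `quality`) respond to the fixed value.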
