Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning
Ahmet Kapkiç, Pratanu Mandal, Shu Wan, Paras Sheth, Abhinav Gorantla, Yoonhyuk Choi, Huan Liu, K. Selçuk Candan
TL;DR
The paper addresses the lack of standardized benchmarks for causal learning from observational data, a setting in which traditional ML captures correlations rather than causation. It introduces CausalBench, a flexible framework that unifies datasets, models, metrics, and evaluation services, enabling reproducible and fair benchmarking across causal inference, discovery, and interpretability tasks. Key contributions include a structured ontology for benchmark components, a provenance-enabled infrastructure with DOIs via Zenodo, and causally-informed analysis and recommendations to guide future experiments. The platform aims to accelerate rigorous, cross-domain causal learning research by facilitating collaboration, transparency, and objective comparisons.
Abstract
Despite the exceptional success of machine learning (ML) technologies in many applications, users are starting to notice a critical shortcoming of ML: correlation is a poor substitute for causation. The conventional way to discover causal relationships is to use randomized controlled trials (RCTs); in many situations, however, these are impractical or even unethical. Causal learning from observational data offers a promising alternative. Although relatively recent, causal learning aims to go far beyond conventional machine learning, yet several major challenges remain. Unfortunately, advances are hampered by the lack of unified benchmark datasets, algorithms, metrics, and evaluation service interfaces for causal learning. In this paper, we introduce CausalBench, a transparent, fair, and easy-to-use evaluation platform that aims to (a) advance research in causal learning by facilitating scientific collaboration on novel algorithms, datasets, and metrics and (b) promote scientific objectivity, reproducibility, fairness, and awareness of bias in causal learning research. CausalBench provides services for benchmarking data, algorithms, models, and metrics, serving the needs of a broad range of scientific and engineering disciplines.
