Causality for Tabular Data Synthesis: A High-Order Structure Causal Benchmark Framework

Ruibo Tu, Zineb Senane, Lele Cao, Cheng Zhang, Hedvig Kjellström, Gustav Eje Henter

TL;DR

This paper tackles the problem of capturing high-order causal structure in tabular data synthesis by proposing high-order structural causal information as prior knowledge and introducing CauTabBench, a benchmark framework that generates synthetic benchmark datasets from causal DAGs and evaluates tabular synthesis models across multiple high-order tasks and downstream causal-inference challenges. The approach defines three levels of causal information (skeleton, Markov equivalence class, and full DAG) and uses causal discovery methods to extract ground-truth labels, enabling intrinsic evaluation beyond downstream task performance. Through experiments with LLM- and diffusion-based synthesis methods, the work reveals significant gaps between ideal and actual performance, shows that current methods differ in their ability to capture joint vs. individual causal information, and demonstrates that intrinsic high-order metrics provide a nuanced view of model capabilities and limitations. The framework aims to guide development of high-order causal-aware tabular synthesis with implications for real-world applications where causal reasoning under distribution shifts and cross-table contexts is essential, and its results highlight practical gaps that future methods should address.

Abstract

Tabular synthesis models remain ineffective at capturing complex dependencies, and the quality of synthetic data is still insufficient for comprehensive downstream tasks, such as prediction under distribution shifts, automated decision-making, and cross-table understanding. A major challenge is the lack of prior knowledge about underlying structures and high-order relationships in tabular data. We argue that a systematic evaluation of high-order structural information for tabular data synthesis is the first step towards solving the problem. In this paper, we introduce high-order structural causal information as natural prior knowledge and provide a benchmark framework for the evaluation of tabular synthesis models. The framework allows us to generate benchmark datasets with a flexible range of data generation processes and to train tabular synthesis models using these datasets for further evaluation. We propose multiple benchmark tasks, high-order metrics, and causal inference tasks as downstream tasks for evaluating the quality of synthetic data generated by the trained models. Our experiments demonstrate how to leverage the benchmark framework to evaluate a model's capability of capturing high-order structural causal information. Furthermore, our benchmarking results provide an initial assessment of state-of-the-art tabular synthesis models. They reveal significant gaps between ideal and actual performance and show how baseline methods differ. Our benchmark framework is available at https://github.com/TURuibo/CauTabBench.

Paper Structure (26 sections, 5 equations, 3 figures, 10 tables)

This paper contains 26 sections, 5 equations, 3 figures, 10 tables.

Figures (3)

  • Figure 1: Our high-order structural causal benchmark framework and results. Benchmark datasets are generated from randomly sampled causal graphs, which are also used to derive the ground-truth causal information. The benchmark datasets are used to train tabular synthesis models, which in turn generate synthetic datasets. To evaluate synthesis models on high-order structural causal information, we extract causal information from the benchmark and synthetic datasets using causal discovery methods and then apply high-order structural causal metrics to compare it with the known ground-truth causal information. Benchmark results show the performance of baseline methods under different metrics. The metric values computed on the benchmark datasets serve as references.
  • Figure 2: Three levels of high-order structural causal information.
  • Figure 3: Benchmark on d-separations: ROC curves of the conditional independence test results for the Markov equivalence class level evaluation.
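
The pipeline described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the CauTabBench implementation: the linear-Gaussian data generation process, the edge-weight range, the skeleton F1 score, and all function names below are assumptions chosen for illustration. In the framework itself, the estimated graph would come from a causal discovery method run on benchmark or synthetic data; that step is left out here.

```python
import numpy as np


def sample_random_dag(n_nodes, edge_prob, rng):
    """Sample a random DAG as an adjacency matrix (adj[i, j] = 1 means i -> j).

    Keeping only the strict upper triangle guarantees acyclicity, with the
    node order 0..n_nodes-1 acting as a topological order.
    """
    adj = (rng.random((n_nodes, n_nodes)) < edge_prob).astype(int)
    return np.triu(adj, k=1)


def simulate_linear_gaussian(adj, n_samples, rng):
    """Draw samples from a linear-Gaussian SCM on the given DAG.

    Assumption: edge weights are drawn uniformly from [0.5, 1.5] and each
    variable gets independent standard normal noise.
    """
    n_nodes = adj.shape[0]
    weights = adj * rng.uniform(0.5, 1.5, size=adj.shape)
    data = np.zeros((n_samples, n_nodes))
    for j in range(n_nodes):  # parents of j always have a smaller index
        noise = rng.normal(size=n_samples)
        data[:, j] = data[:, :j] @ weights[:j, j] + noise
    return data


def skeleton(adj):
    """Return the undirected skeleton (edge directions dropped)."""
    return ((adj + adj.T) > 0).astype(int)


def skeleton_f1(true_adj, est_adj):
    """F1 score between the undirected edge sets of two graphs.

    Returns 0.0 when there are no true positives.
    """
    upper = np.triu_indices(true_adj.shape[0], k=1)
    true_edges = skeleton(true_adj)[upper]
    est_edges = skeleton(est_adj)[upper]
    tp = int(np.sum((true_edges == 1) & (est_edges == 1)))
    fp = int(np.sum((true_edges == 0) & (est_edges == 1)))
    fn = int(np.sum((true_edges == 1) & (est_edges == 0)))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


rng = np.random.default_rng(0)
true_dag = sample_random_dag(n_nodes=5, edge_prob=0.4, rng=rng)
benchmark_data = simulate_linear_gaussian(true_dag, n_samples=1000, rng=rng)
```

Because the skeleton ignores edge directions, `skeleton_f1(true_dag, true_dag.T)` scores the same as `skeleton_f1(true_dag, true_dag)`. Distinguishing such graphs is the job of the Markov equivalence class and full DAG levels.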