Causal Inference with Large Language Model: A Survey
Jing Ma
TL;DR
The paper surveys the use of large language models (LLMs) for causal inference in natural language processing, outlining how LLMs can leverage domain knowledge, reasoning, and context to tackle causal tasks beyond traditional tabular data. It formalizes causality with structural causal models (SCMs) and Pearl's ladder of causation, then categorizes LLM approaches into prompting, fine-tuning, hybrids with conventional causal methods, and knowledge augmentation. Across causal discovery, causal effect estimation, and other tasks such as attribution, counterfactual reasoning, and explanation, the survey synthesizes datasets, evaluation results, and key insights. It highlights strong performance in pairwise discovery and more nuanced results for higher-rung reasoning, which depend on prompting and model scale. The discussion points to opportunities and challenges, including integrating human knowledge, improving data generation and robustness, mitigating hallucinations, and developing causality-focused benchmarks and models with practical impact in high-stakes domains.
Abstract
Causal inference has been a pivotal challenge across diverse domains such as medicine and economics, demanding a complex integration of human knowledge, mathematical reasoning, and data mining capabilities. Recent advancements in natural language processing (NLP), particularly with the advent of large language models (LLMs), have introduced promising opportunities for traditional causal inference tasks. This paper reviews recent progress in applying LLMs to causal inference, encompassing various tasks spanning different levels of causation. We summarize the main causal problems and approaches, and present a comparison of their evaluation results in different causal scenarios. Furthermore, we discuss key findings and outline directions for future research, underscoring the potential implications of integrating LLMs in advancing causal inference methodologies.
