Large Language Models and Causal Inference in Collaboration: A Survey
Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, Yuhang Zhou, Fuxiao Liu, Tianrui Guan, Haoliang Wang, Tong Yu, Julian McAuley, Wei Ai, Furong Huang
TL;DR
This survey articulates a bidirectional bridge between causal inference and large language models (LLMs), analyzing how causal frameworks can enhance LLM reasoning, fairness, safety, and explainability, including in multimodal settings. It catalogs methods for improving LLMs from a causal perspective (model understanding, commonsense and counterfactual reasoning, bias mitigation, safety, and explainability) alongside evaluation benchmarks. Conversely, it surveys how LLMs can aid causal inference, notably in causal-relationship discovery and treatment-effect estimation, via counterfactual data generation, prompting, and integration with traditional causal-discovery techniques. The work highlights practical implications for building more reliable, fair, and interpretable AI systems and outlines future directions such as data augmentation for imbalanced data and relaxing unconfoundedness assumptions with language-model priors.
Abstract
Causal inference has shown potential in enhancing the predictive accuracy, fairness, robustness, and explainability of Natural Language Processing (NLP) models by capturing causal relationships among variables. The emergence of generative Large Language Models (LLMs) has significantly impacted various NLP domains, particularly through their advanced reasoning capabilities. This survey focuses on evaluating and improving LLMs from a causal view in the following areas: understanding and improving LLMs' reasoning capacity, addressing fairness and safety issues in LLMs, complementing LLMs with explanations, and handling multimodality. Meanwhile, LLMs' strong reasoning capacities can in turn contribute to the field of causal inference by aiding causal relationship discovery and causal effect estimation. This review explores the interplay between causal inference frameworks and LLMs from both perspectives, emphasizing their collective potential to further the development of more advanced and equitable artificial intelligence systems.
