A Survey on Enhancing Causal Reasoning Ability of Large Language Models

Xin Li, Zhuo Cai, Shoujin Wang, Kun Yu, Fang Chen

TL;DR

This survey addresses the gap in understanding how to enhance causal reasoning in large language models (LLMs) by proposing a taxonomy that splits methods into domain-knowledge-driven and model-driven approaches. It details subcategories including domain experts, contextual knowledge, predefined prompts, fine-tuning, causal graph construction, causal effect estimation, and counterfactual reasoning, and compares their strengths and weaknesses. The paper compiles benchmarks and metrics such as QRDATA, CLEAR, CLADDER, CausalProbe-2024, SHD, SID, CESAR, and CausalScore to standardize evaluation, and outlines future directions spanning multi-modal reasoning, memory mechanisms, self-learning, ethical alignment, and unified datasets. Overall, it provides a structured overview to guide researchers in evaluating and improving LLMs’ causal reasoning capabilities with a view toward real-world applicability and trustworthy AI.

Abstract

Large language models (LLMs) have recently shown remarkable performance in language tasks and beyond. However, because their inherent causal reasoning ability is limited, LLMs still struggle with tasks that require robust causal reasoning, such as healthcare and economic analysis. As a result, a growing body of research has focused on enhancing the causal reasoning ability of LLMs. Despite this rapidly growing research, no survey yet comprehensively reviews the challenges, progress and future directions in this area. To bridge this gap, we systematically review the literature on how to strengthen LLMs' causal reasoning ability. We begin by introducing the background and motivation for this topic, and then summarise the key challenges in this area. Thereafter, we propose a novel taxonomy to systematically categorise existing methods, together with detailed comparisons within and between classes of methods. Furthermore, we summarise existing benchmarks and evaluation metrics for assessing LLMs' causal reasoning ability. Finally, we outline future research directions for this emerging field, offering insights and inspiration to researchers and practitioners in the area.

Paper Structure

This paper contains 35 sections, 4 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Classification of methods for enhancing LLMs' causal reasoning ability
  • Figure 2: Overview of methods for enhancing LLMs’ causal reasoning ability
  • Figure 3: Number of publications in each class per year
  • Figure 4: Classification of benchmarks and evaluation metrics for assessing LLMs' causal reasoning ability