Causality for Natural Language Processing

Zhijing Jin

TL;DR

Causality for Natural Language Processing investigates how LLMs reason about cause-effect relationships, how to interpret their causal reasoning, and how causal structure governs learning in NLP. The work introduces Corr2Cause to probe pure causal inference from correlations and CLadder with CausalCoT to elicit formal causal reasoning, revealing substantial limitations in off-the-shelf models and marked gains from targeted prompting and fine-tuning. It then examines how causal and anticausal learning shapes NLP tasks, applying the independent causal mechanisms principle and demonstrating improved sentiment analysis through causality-aware prompts. Finally, the thesis applies causal inference to social science and scholarly impact estimation, including policy analysis from COVID Twitter data and a TextMatch-based CausalCite metric for paper influence, illustrating practical benefits and challenges of causal NLP in real-world data. Across four parts, the work advances benchmarks, mechanistic interpretability, data-collection practices, and causal analysis tools, laying groundwork for more robust, interpretable, and socially impactful NLP systems.

Abstract

Causal reasoning is a cornerstone of human intelligence and a critical capability for artificial systems aiming to achieve advanced understanding and decision-making. This thesis delves into various dimensions of causal reasoning and understanding in large language models (LLMs). It encompasses a series of studies that explore the causal inference skills of LLMs, the mechanisms behind their performance, and the implications of causal and anticausal learning for natural language processing (NLP) tasks. Additionally, it investigates the application of causal reasoning in text-based computational social science, specifically focusing on political decision-making and the evaluation of scientific impact through citations. Through novel datasets, benchmark tasks, and methodological frameworks, this work identifies key challenges and opportunities to improve the causal capabilities of LLMs, providing a comprehensive foundation for future research in this evolving field.

Paper Structure

This paper contains 379 sections, 35 equations, 59 figures, 58 tables, 2 algorithms.

Figures (59)

  • Figure 1: Illustration of the motivation behind our task and dataset.
  • Figure 2: Pipeline of the data construction process.
  • Figure 3: Example question in our CLadder dataset featuring an instance of Simpson's paradox [pearl2022comment]. We generate the following (symbolic) triple: (i) the causal query; (ii) the ground-truth answer, derived through a causal inference engine [pearl2018book]; and (iii) a step-by-step explanation. We then verbalize these questions by turning them into stories, inspired by examples from the causality literature, which can be expressed in natural language.
  • Figure 4: The data-generating process of the CLadder dataset. The upper part of the figure describes the formal part of the question generation, which samples inputs for the CI Engine and derives a ground truth answer. The bottom part describes the natural language part of the question generation, i.e., its verbalization, based on multiple stories and different degrees of alignment with commonsense knowledge.
  • Figure 5: Distributions of query types in our 10K data.
  • ...and 54 more figures