Causality for Natural Language Processing
Zhijing Jin
TL;DR
Causality for Natural Language Processing investigates how LLMs reason about cause-effect relationships, how to interpret their causal reasoning, and how causal structure governs learning in NLP. The work introduces Corr2Cause to probe pure causal inference from correlations and CLadder with CausalCoT to elicit formal causal reasoning, revealing substantial limitations in off-the-shelf models and clear gains from targeted prompting and fine-tuning. It then examines how causal and anticausal learning shape NLP tasks, applying the independent causal mechanisms principle and demonstrating improved sentiment analysis through causality-aware prompts. Finally, the thesis applies causal inference to social science and scholarly impact estimation, including policy analysis from COVID Twitter data and a TextMatch-based CausalCite metric for paper influence, illustrating the practical benefits and challenges of causal NLP on real-world data. Across four parts, the work contributes benchmarks, mechanistic interpretability analyses, data-collection practices, and causal analysis tools, laying groundwork for more robust, interpretable, and socially impactful NLP systems.
Abstract
Causal reasoning is a cornerstone of human intelligence and a critical capability for artificial systems aiming to achieve advanced understanding and decision-making. This thesis examines several dimensions of causal reasoning and understanding in large language models (LLMs). It comprises a series of studies that explore the causal inference skills of LLMs, the mechanisms behind their performance, and the implications of causal and anticausal learning for natural language processing (NLP) tasks. It also investigates the application of causal reasoning in text-based computational social science, focusing on political decision-making and the evaluation of scientific impact through citations. Through novel datasets, benchmark tasks, and methodological frameworks, this work identifies key challenges and opportunities for improving the causal capabilities of LLMs, providing a foundation for future research in this evolving field.
