Causal Agent based on Large Language Model
Kairong Han, Kun Kuang, Ziyu Zhao, Junjian Ye, Fei Wu
TL;DR
This work addresses the mismatch between tabular causal data and natural-language-centric LLMs by introducing a Causal Agent that combines causal tools, ReAct-style iterative reasoning, and a memory of causal graphs to perform four-level causal reasoning on tabular data. It formalizes four levels of causal problems (variable, edge, causal graph, causal effect) and evaluates them on the CausalTQA benchmark ($\approx$1.4K questions) and the real-world QRData dataset. On CausalTQA the agent exceeds 80\% accuracy at every level, including estimation of the average treatment effect $ATE = \mathbb{E}[Y(T=t_1) - Y(T=t_0)]$. The main contributions are the hierarchical problem framing, a tool-augmented LLM agent with non-textual memory, and demonstrated scalability and generalization across synthetic and real data, with a 6\% improvement over the previous SOTA on QRData. The approach bridges causal inference and LLMs, offering interpretable, controllable, automated causal reasoning for tabular data with potentially broad applicability.
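To make the ATE expression above concrete, here is a minimal Python sketch of the simplest possible estimator, a difference in group means. It is only an illustration under a strong assumption (treatment is randomly assigned, so there is no confounding) and is not the estimator used by the Causal Agent's tools. The function name `difference_in_means_ate`, the column names `T` and `Y`, and the toy rows are all hypothetical.

```python
from statistics import mean


def difference_in_means_ate(rows, treatment="T", outcome="Y", t1=1, t0=0):
    """Estimate ATE = E[Y(T=t1) - Y(T=t0)] as a difference in group means.

    Assumption: treatment is randomized, so the observed group means are
    unbiased for the potential-outcome means. With confounding, this naive
    estimate is generally biased and an adjustment method is needed.
    """
    treated = [r[outcome] for r in rows if r[treatment] == t1]
    control = [r[outcome] for r in rows if r[treatment] == t0]
    if not treated or not control:
        raise ValueError("both treatment groups need at least one row")
    return mean(treated) - mean(control)


# Toy tabular data (hypothetical): each dict is one row of the table.
rows = [
    {"T": 1, "Y": 5.0},
    {"T": 1, "Y": 7.0},
    {"T": 0, "Y": 2.0},
    {"T": 0, "Y": 4.0},
]

ate = difference_in_means_ate(rows)
print(ate)
```

On this toy table the treated mean is 6.0 and the control mean is 3.0, so the estimate is 3.0.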
Abstract
Large language models (LLMs) have achieved significant success across various domains. However, the inherent complexity of causal problems and causal theory makes them hard to describe accurately in natural language, which in turn makes it difficult for LLMs to comprehend causal methods and apply them correctly. Additionally, causal datasets are typically tabular, while LLMs excel at handling natural language data, creating a structural mismatch that impedes effective reasoning over tabular data. To address these challenges, we equip the LLM with causal tools within an agent framework, named the Causal Agent, enabling it to tackle causal problems. The causal agent comprises tool, memory, and reasoning modules. In the tool module, the causal agent calls Python code and uses encapsulated causal functions to align tabular data with natural language. In the reasoning module, the causal agent reasons over multiple iterations of tool calls. In the memory module, the causal agent maintains a dictionary instance whose keys are unique names and whose values are causal graphs. To verify the causal ability of the causal agent, we establish a Causal Tabular Question Answer (CausalTQA) benchmark consisting of four levels of causal problems: variable level, edge level, causal graph level, and causal effect level. CausalTQA contains about 1.4K questions across these four levels. The causal agent demonstrates remarkable efficacy on the four-level causal problems, with accuracy above 80\% at every level. On the real-world dataset QRData, the causal agent outperforms the previous SOTA by 6\%. For further insights and implementation details, our code is accessible via the GitHub repository https://github.com/kairong-han/causal_agent.
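The memory module described above can be sketched as follows. This is a hedged illustration, not the repository's implementation: the class name `CausalGraphMemory`, its methods, and the adjacency-set graph representation are assumptions made here. The only detail taken from the abstract is that memory is a dictionary from unique names to causal graphs.

```python
class CausalGraphMemory:
    """Dictionary-backed memory: unique name -> causal graph.

    Assumption: a causal graph is stored as a mapping from each node to the
    set of its direct effects (children). The actual agent may use a
    different graph object.
    """

    def __init__(self):
        self._graphs = {}

    def store(self, name, edges):
        """Save a graph built from (cause, effect) pairs under a unique name."""
        if name in self._graphs:
            raise KeyError(f"graph name already in use: {name}")
        graph = {}
        for cause, effect in edges:
            graph.setdefault(cause, set()).add(effect)
            graph.setdefault(effect, set())  # make sure sink nodes appear too
        self._graphs[name] = graph

    def get(self, name):
        """Return the stored graph so a later tool call can reuse it."""
        return self._graphs[name]

    def has_edge(self, name, cause, effect):
        """Edge-level query: is there a direct edge cause -> effect?"""
        return effect in self._graphs[name].get(cause, set())


# Hypothetical usage: one tool call stores a discovered graph, and a later
# reasoning step refers to it by name instead of re-describing it in text.
memory = CausalGraphMemory()
memory.store("graph_0", [("smoking", "tar"), ("tar", "cancer")])
print(memory.has_edge("graph_0", "smoking", "tar"))     # direct edge exists
print(memory.has_edge("graph_0", "smoking", "cancer"))  # only an indirect path
```

Keeping graphs in a keyed store rather than in the prompt lets the reasoning loop pass a short name between tool calls, which is one plausible reading of why the memory is described as non-textual.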
