LLM4Causal: Democratized Causal Tools for Everyone via Large Language Model

Haitao Jiang, Lin Ge, Yuhe Gao, Jianian Wang, Rui Song

TL;DR

LLM4Causal delivers end-to-end solutions for causal problems and provides easy-to-understand answers, significantly outperforming the baselines.

Abstract

Large Language Models (LLMs) have shown success in language understanding and reasoning on general topics. However, their capability to perform inference based on user-specified structured data and knowledge in corpus-rare concepts, such as causal decision-making, is still limited. In this work, we explore the possibility of fine-tuning an open-source LLM into LLM4Causal, which can identify the causal task, execute the corresponding function, and interpret its numerical results based on the user's query and the provided dataset. We also propose a data generation process for more controllable GPT prompting and present two instruction-tuning datasets: (1) Causal-Retrieval-Bench, for causal problem identification and input parameter extraction for causal function calling, and (2) Causal-Interpret-Bench, for in-context causal interpretation. Through end-to-end evaluations and two ablation studies, we show that LLM4Causal can deliver end-to-end solutions for causal problems and provide easy-to-understand answers, significantly outperforming the baselines.

Paper Structure (33 sections, 1 equation, 5 figures, 6 tables)

Figures (5)

  • Figure 1: User interaction with ChatGPT on causal related questions.
  • Figure 2: A flowchart of LLM4Causal, which consists of three major steps: user request interpretation, causal tool assignment and execution, and output interpretation.
  • Figure 3: Causal-Retrieval-Bench construction procedure for the first step. The GPT prompts used in this section can be found in Appendix \ref{ch3:step1}. In the left panel, GPT is prompted to list different topics and measurable variables for each topic. With tasks randomly drawn from Section 2 and some variables under the same topic, we generate the JSON output in the middle panel. The colored boxes in the left and middle panels mark the shared topics and variable names of interest. In the right panel, the numbered boxes in template 2 are blanks to be filled with the task description, the supplied demonstration, the JSON inputs, and the output restrictions. Common connectors and general instructions in prompt 2 are shortened to "xxxxx" because of limited figure space.
  • Figure 4: Causal tool assignment and execution in the second step. The algorithm is executed automatically, using both the information extracted in Step 1 and the user-provided dataset as inputs, to obtain the estimated result.
  • Figure 5: Illustration of building Causal-Interpret-Bench in the third step. Details about the prompts used can be found in Appendix \ref{Step3}. In template 3, the introduction and the final restriction parts are fixed because the instruction is always about interpretation, while the remaining contextual information is filled in differently each time the GPT API is called. As in Section \ref{Step1:data}, the general introduction and the restriction on outputs are reused across different causal problems. We feed the original causal query, the causal task associated with the query, and the employed methodology with the corresponding function outputs to the prompt as the interpretation context.
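
The three steps named in the figure captions above (request interpretation, tool assignment and execution, output interpretation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the JSON keys, the task name `average_treatment_effect`, the function names, and the template wording are all hypothetical, and a naive difference in means stands in for a real causal estimator. The fine-tuned model itself is not called; its Step 1 output is hard-coded.

```python
import json
from statistics import mean

# Step 1 (assumed output): the structured request a fine-tuned model might
# extract from a user query. The keys are hypothetical, not the paper's schema.
query = "Does offering a coupon change how much customers spend?"
request = json.loads(
    '{"task": "average_treatment_effect", '
    '"treatment": "coupon", "outcome": "spend"}'
)

# Toy user-provided dataset (made-up values, for illustration only).
rows = [
    {"coupon": 1, "spend": 12.0},
    {"coupon": 1, "spend": 10.0},
    {"coupon": 0, "spend": 8.0},
    {"coupon": 0, "spend": 6.0},
]


def difference_in_means(data, treatment, outcome):
    """Naive stand-in estimator: mean outcome of treated minus control."""
    treated = [r[outcome] for r in data if r[treatment] == 1]
    control = [r[outcome] for r in data if r[treatment] == 0]
    return mean(treated) - mean(control)


# Step 2: map the extracted task name to a function and run it on the data.
TOOLS = {"average_treatment_effect": difference_in_means}


def run_tool(req, data):
    tool = TOOLS[req["task"]]
    return tool(data, req["treatment"], req["outcome"])


# Step 3: fill a fixed template with the per-query context. The introduction
# and the final restriction stay the same; only the middle fields change.
INTERPRET_TEMPLATE = (
    "You interpret causal analysis results for a general audience.\n"
    "Query: {query}\n"
    "Task: {task}\n"
    "Method: {method}\n"
    "Result: {result}\n"
    "Answer in plain language and do not add numbers that are not given."
)


def build_interpretation_prompt(user_query, req, method, result):
    return INTERPRET_TEMPLATE.format(
        query=user_query, task=req["task"], method=method, result=result
    )


effect = run_tool(request, rows)
prompt = build_interpretation_prompt(query, request, "difference in means", effect)
print(prompt)
```

In this sketch the dispatch table keeps tool selection separate from estimation, so a different estimator could be registered under a task name without touching the prompt-building step.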