LLM-initialized Differentiable Causal Discovery

Shiv Kampani, David Hidary, Constantijn van der Poel, Martin Ganahl, Brenda Miao

TL;DR

LLM-DCD uses an LLM to initialize the optimization of the maximum likelihood objective function of DCD approaches, thereby incorporating strong priors into the discovery method. This opens up new opportunities for traditional causal discovery methods to benefit from future improvements in the causal reasoning capabilities of LLMs.

Abstract

The discovery of causal relationships between random variables is an important yet challenging problem that has applications across many scientific domains. Differentiable causal discovery (DCD) methods are effective in uncovering causal relationships from observational data; however, these approaches often suffer from limited interpretability and face challenges in incorporating domain-specific prior knowledge. In contrast, causal discovery approaches based on Large Language Models (LLMs) have recently been shown capable of providing useful priors for causal discovery but struggle with formal causal reasoning. In this paper, we propose LLM-DCD, which uses an LLM to initialize the optimization of the maximum likelihood objective function of DCD approaches, thereby incorporating strong priors into the discovery method. To achieve this initialization, we design our objective function to depend on an explicitly defined adjacency matrix of the causal graph as its only variational parameter. Directly optimizing the explicitly defined adjacency matrix provides a more interpretable approach to causal discovery. Additionally, we demonstrate that our approach achieves higher accuracy on key benchmark datasets than state-of-the-art alternatives, and provide empirical evidence that the quality of the initialization directly impacts the quality of the final output of our DCD approach. LLM-DCD opens up new opportunities for traditional causal discovery methods like DCD to benefit from future improvements in the causal reasoning capabilities of LLMs.
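
The idea of initializing an explicit adjacency matrix and then refining it by gradient-based optimization can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's objective or implementation: the score is a linear-Gaussian least-squares loss, acyclicity is encouraged with a polynomial penalty of the kind used in continuous DAG-learning methods, and the edge list standing in for the LLM's answer is hard-coded. The function names (`init_adjacency`, `acyclicity`, `loss_and_grad`, `refine`) are hypothetical.

```python
import numpy as np


def init_adjacency(variables, proposed_edges, weight=1.0):
    """Build a d x d matrix with A[i, j] = weight for each proposed edge i -> j."""
    index = {name: i for i, name in enumerate(variables)}
    A = np.zeros((len(variables), len(variables)))
    for cause, effect in proposed_edges:
        A[index[cause], index[effect]] = weight
    return A


def acyclicity(A):
    """Polynomial penalty tr((I + A*A/d)^d) - d; it is zero iff A has no cycles."""
    d = A.shape[0]
    M = np.eye(d) + (A * A) / d
    return np.trace(np.linalg.matrix_power(M, d)) - d


def loss_and_grad(A, X, lam):
    """Least-squares score (assumed linear model) plus lam * acyclicity penalty."""
    n, d = X.shape
    residual = X - X @ A
    loss = 0.5 / n * np.sum(residual ** 2) + lam * acyclicity(A)
    M = np.eye(d) + (A * A) / d
    grad = -(X.T @ residual) / n
    grad += lam * np.linalg.matrix_power(M, d - 1).T * 2.0 * A
    np.fill_diagonal(grad, 0.0)  # self-loops are never updated
    return loss, grad


def refine(A0, X, lam=1.0, lr=0.01, steps=300):
    """Plain gradient descent on the adjacency matrix, starting from the prior A0."""
    A = A0.copy()
    np.fill_diagonal(A, 0.0)
    for _ in range(steps):
        _, grad = loss_and_grad(A, X, lam)
        A -= lr * grad
    return A


# Synthetic chain a -> b -> c; the edge list below stands in for an LLM's proposal.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = 0.8 * a + rng.normal(size=500)
c = -0.5 * b + rng.normal(size=500)
X = np.column_stack([a, b, c])
X = X - X.mean(axis=0)

A0 = init_adjacency(["a", "b", "c"], [("a", "b"), ("b", "c")])
A = refine(A0, X)
```

Because the adjacency matrix is the only parameter being optimized in this sketch, the prior enters purely through the starting point `A0`, and the refined matrix `A` can be inspected entry by entry. A different initialization (for example, all zeros) can be swapped in by changing the `proposed_edges` argument.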

Paper Structure

This paper contains 16 sections, 2 theorems, 10 equations, 1 figure, 6 tables, 3 algorithms.

Key Result

Theorem 1

Under the following regularity assumptions (these assumptions have been simplified from the original assumptions listed in brouillard2020differentiablecausaldiscoveryinterventional, since we do not consider interventions): Any CGM that maximizes the DCD objective function is equivalent to the ground-truth CGM up to a Markov equivalence class (i.e., the CGM is Markov equivalent to the ground-tru

Figures (1)

  • Figure 1: F1-score, precision, recall, and runtime of causal discovery methods on observational datasets of different sizes ($d$).

Theorems & Definitions (3)

  • Theorem : brouillard2020differentiablecausaldiscoveryinterventional
  • Theorem : Regularity of LLM-DCD
  • proof