Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery
Yuni Susanti, Michael Färber
TL;DR
The paper tackles knowledge-based causal discovery with small language models by injecting structured knowledge from knowledge graphs into prompts. It introduces KG Structure as Prompt, which converts KG signals (neighbor nodes, common neighbors, and metapaths) into natural-language graph context and embeds this context in prompts tailored for three SLM architectures (MLM, CLM, Seq2SeqLM). Across biomedical and open-domain datasets and under few-shot settings, the approach outperforms baselines without KG prompts and rivals full-data fine-tuning, with metapath information often yielding the strongest gains. The results suggest that, when augmented with structured KG context, small models have the potential to surpass larger LLMs in causal discovery tasks, highlighting the practicality of KG-informed prompting for resource-constrained settings. The methodology is flexible across different KGs (e.g., Wikidata, Hetionet) and model families, and the code and data are released on GitHub, enabling broader adoption and extension to more complex causal graphs.
Abstract
Causal discovery aims to estimate causal structures among variables based on observational data. Large Language Models (LLMs) offer a fresh perspective to tackle the causal discovery problem by reasoning on the metadata associated with variables rather than their actual data values, an approach referred to as knowledge-based causal discovery. In this paper, we investigate the capabilities of Small Language Models (SLMs, defined as LLMs with fewer than 1 billion parameters) with prompt-based learning for knowledge-based causal discovery. Specifically, we present KG Structure as Prompt, a novel approach for integrating structural information from a knowledge graph, such as common neighbor nodes and metapaths, into prompt-based learning to enhance the capabilities of SLMs. Experimental results on three types of biomedical and open-domain datasets under few-shot settings demonstrate the effectiveness of our approach, surpassing most baselines and even conventional fine-tuning approaches trained on full datasets. Our findings further highlight the strong capabilities of SLMs: in combination with knowledge graphs and prompt-based learning, SLMs demonstrate the potential to surpass LLMs with a larger number of parameters. Our code and datasets are available on GitHub.
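
To make the idea of turning knowledge-graph structure into prompt text more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written typed graph with placeholder node names, and every function name (`build_adjacency`, `common_neighbors`, `find_paths`, `to_metapath`, `graph_context`, `build_prompt`) as well as the prompt wording is hypothetical. It extracts the common neighbors and the metapaths (type-level paths) between a variable pair, verbalizes them, and places the result in a cloze-style prompt of the kind a masked language model could fill in.

```python
# Toy typed knowledge graph: hypothetical placeholder data, not real facts.
# Each edge is (head, relation, tail).
EDGES = [
    ("compound_A", "binds", "gene_B"),
    ("gene_B", "associates", "disease_D"),
    ("compound_A", "upregulates", "gene_C"),
    ("gene_C", "associates", "disease_D"),
    ("compound_A", "treats", "disease_E"),
]
NODE_TYPES = {
    "compound_A": "Compound",
    "gene_B": "Gene",
    "gene_C": "Gene",
    "disease_D": "Disease",
    "disease_E": "Disease",
}


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> list of (relation, neighbor)."""
    adj = {}
    for head, rel, tail in edges:
        adj.setdefault(head, []).append((rel, tail))
        adj.setdefault(tail, []).append((rel, head))
    return adj


def common_neighbors(adj, a, b):
    """Return the sorted nodes adjacent to both a and b."""
    near_a = {node for _, node in adj.get(a, [])}
    near_b = {node for _, node in adj.get(b, [])}
    return sorted(near_a & near_b)


def find_paths(adj, source, target, max_hops=2):
    """Enumerate simple paths from source to target with at most max_hops edges."""
    paths = []
    stack = [(source, [source], [])]
    while stack:
        node, visited, rels = stack.pop()
        if node == target and rels:
            paths.append((visited, rels))
            continue
        if len(rels) == max_hops:
            continue
        for rel, nxt in adj.get(node, []):
            if nxt not in visited:
                stack.append((nxt, visited + [nxt], rels + [rel]))
    return paths


def to_metapath(types, nodes, rels):
    """Abstract a concrete path into its type-level form, e.g. Compound-binds-Gene."""
    parts = [types[nodes[0]]]
    for rel, node in zip(rels, nodes[1:]):
        parts += [rel, types[node]]
    return "-".join(parts)


def graph_context(adj, types, a, b, max_hops=2):
    """Verbalize common neighbors and metapaths for the pair (a, b)."""
    sentences = []
    shared = common_neighbors(adj, a, b)
    if shared:
        sentences.append(f"{a} and {b} are both connected to {', '.join(shared)}.")
    for nodes, rels in sorted(find_paths(adj, a, b, max_hops)):
        metapath = to_metapath(types, nodes, rels)
        sentences.append(f"{a} reaches {b} via the metapath {metapath}.")
    return " ".join(sentences)


def build_prompt(text, a, b, graph_ctx, mask_token="[MASK]"):
    """Assemble a cloze-style prompt; the template wording is an assumption."""
    return (
        f"{text} Graph context: {graph_ctx} "
        f"Question: does {a} cause {b}? Answer: {mask_token}."
    )


adjacency = build_adjacency(EDGES)
context = graph_context(adjacency, NODE_TYPES, "compound_A", "disease_D")
print(build_prompt("Some sentence mentioning both variables.",
                   "compound_A", "disease_D", context))
```

In a real setting the edges would come from a knowledge graph such as Wikidata or Hetionet rather than a hard-coded list, and the template would be adapted to the model family (a mask slot for MLMs, a continuation or generation target for CLMs and Seq2SeqLMs).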
