Prompt-Time Symbolic Knowledge Capture with Large Language Models
Tolga Çöplü, Arto Bendiken, Andrii Skomorokhov, Eduard Bateiko, Stephen Cobb, Joshua J. Bouw
TL;DR
This work studies prompt-driven symbolic knowledge capture for LLMs through prompt-to-triple (P2T) generation: extracting (subject, predicate, object) triples from user prompts, with predicates drawn from a restricted relation vocabulary. It compares zero-shot prompting, few-shot prompting, and fine-tuning (via QLoRA) on a synthetic dataset with the Mistral-7B-Instruct-v0.2 model, evaluating at both the relation level and the triple level. All three methods achieve strong relation-level recall, and fine-tuning delivers the best triple-level performance, which points to targeted fine-tuning as a promising route to reliable knowledge-graph extraction from prompts. These findings support the practical integration of knowledge graphs with LLMs for user-specific knowledge in scenarios such as personal AI assistants, and the released code and dataset enable further exploration of prompt-driven symbolic knowledge capture.
Abstract
Augmenting large language models (LLMs) with user-specific knowledge is crucial for real-world applications such as personal AI assistants. However, LLMs inherently lack mechanisms for prompt-driven knowledge capture. This paper investigates how existing LLM capabilities can be used to enable prompt-driven knowledge capture, with a particular emphasis on knowledge graphs. We address this challenge by focusing on prompt-to-triple (P2T) generation. We explore three methods (zero-shot prompting, few-shot prompting, and fine-tuning) and assess their performance on a specialized synthetic dataset. Our code and datasets are publicly available at https://github.com/HaltiaAI/paper-PTSKC.
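To make prompt-to-triple (P2T) generation concrete, the following minimal Python sketch represents a triple and scores a prediction at two granularities: by predicate only, and by exact match of the full triple. It is an illustration under stated assumptions, not the evaluation code from the repository: the relation names, the example prompt, the `Triple` class, and the `precision_recall` helper are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical restricted relation vocabulary (illustrative only).
RELATIONS = {"birthday", "anniversary", "livesIn"}


@dataclass(frozen=True)
class Triple:
    """A (subject, predicate, object) triple captured from a prompt."""
    subject: str
    predicate: str
    obj: str


def precision_recall(predicted: set, expected: set) -> tuple:
    """Set-based precision and recall; 0.0 when a denominator is empty."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical prompt: "My sister Alice's birthday is on May 5."
expected = {Triple("Alice", "birthday", "May 5")}
# Assumed model output: right relation, but the object is phrased differently.
predicted = {Triple("Alice", "birthday", "5 May")}

# Keep only predictions whose predicate is in the restricted vocabulary.
predicted = {t for t in predicted if t.predicate in RELATIONS}

# Relation-level scoring compares predicates only.
relation_pr = precision_recall(
    {t.predicate for t in predicted},
    {t.predicate for t in expected},
)

# Triple-level scoring requires subject, predicate, and object to all match.
triple_pr = precision_recall(predicted, expected)

print("relation-level (precision, recall):", relation_pr)
print("triple-level (precision, recall):", triple_pr)
```

In this toy case the prediction names the correct relation but phrases the object differently, so it counts as correct at the relation level and as a miss at the triple level. Exact string matching is the simplest possible criterion and is assumed here only for illustration; a real evaluation may normalize subjects and objects before comparison.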
