Knowledge prompt chaining for semantic modeling

Ning Pei Ding, Jingge Du, Zaiwen Feng

TL;DR

Knowledge Prompt Chaining introduces a prompt-based framework to automatically generate semantic models for structured data by transforming graph-structured domain knowledge into system prompts for Long Context LLMs. The approach serializes both data and ontology into JSON and uses a two-step process—semantic labeling and semantic graph building—within a two-stage prompt chain, aided by pruning to suppress hallucinations. Experiments on three real-world datasets demonstrate superior accuracy and efficiency compared to state-of-the-art methods, with high labeling precision and strong performance on complex semantic graphs, while using only a small subset of data points. The method reduces manual effort, improves token efficiency, and offers flexible, scalable semantic modeling for heterogeneous structured data, with future work on automated rule extraction and meta-prompts.
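
The two-stage chain described above can be illustrated with a minimal sketch. This is not the authors' implementation: the ontology layout (`classes`, `relations`), the prompt wording, the function names, and the `llm` callable are all hypothetical, and the pruning shown here simply discards any label or relation that does not appear in the ontology, which may differ from the pruning used in the paper.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: any function mapping (system_prompt, user_prompt) to text.
LLM = Callable[[str, str], str]


def system_prompt(ontology: dict) -> str:
    # Inject the serialized domain knowledge as the system prompt.
    return "Domain ontology as JSON:\n" + json.dumps(ontology, sort_keys=True)


def label_columns(llm: LLM, ontology: dict, rows: List[dict]) -> Dict[str, str]:
    """Stage 1 (semantic labeling): map each column to a 'Class.property' label."""
    user = ("STAGE 1: map each column to a 'Class.property' label; "
            "answer as a JSON object.\n" + json.dumps(rows))
    proposed = json.loads(llm(system_prompt(ontology), user))
    allowed = {f"{cls}.{prop}"
               for cls, props in ontology["classes"].items()
               for prop in props}
    # Pruning (assumed rule): keep only labels that exist in the ontology.
    return {col: label for col, label in proposed.items() if label in allowed}


def build_graph(llm: LLM, ontology: dict, labels: Dict[str, str]) -> List[List[str]]:
    """Stage 2 (semantic graph building): link the labeled classes."""
    user = ("STAGE 2: connect the labeled classes; answer as a JSON list of "
            "[subject, relation, object].\n" + json.dumps(labels, sort_keys=True))
    proposed = json.loads(llm(system_prompt(ontology), user))
    allowed = {tuple(rel) for rel in ontology["relations"]}
    # Pruning (assumed rule): keep only relations that exist in the ontology.
    return [triple for triple in proposed if tuple(triple) in allowed]


def run_chain(llm: LLM, ontology: dict,
              rows: List[dict]) -> Tuple[Dict[str, str], List[List[str]]]:
    # The output of stage 1 becomes part of the prompt for stage 2.
    labels = label_columns(llm, ontology, rows)
    return labels, build_graph(llm, ontology, labels)
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a real client call would be wrapped in a function with the same two-argument shape.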

Abstract

Building semantics for structured data such as CSV, JSON, and XML files is a highly relevant task in the knowledge representation field. Although a vast amount of structured data is available on the internet, mapping it to domain ontologies to build its semantics remains very challenging, because the model that constructs the semantics must understand and learn graph-structured knowledge. Otherwise, the task requires costly human effort. In this paper, we propose a novel automatic semantic modeling framework: Knowledge Prompt Chaining. It serializes the graph-structured knowledge and injects it into LLMs within a Prompt Chaining architecture. Through this knowledge injection and prompt chaining, the model in our framework can learn the structural information and latent space of the graph and generate semantic labels and semantic graphs by naturally following the chain's instructions. Experimental results show that our method achieves better performance than existing leading techniques, despite using reduced structured input data.
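
As a rough illustration of what serializing graph-structured knowledge and working with reduced input data might look like, the sketch below turns a list of ontology edges into nested JSON and keeps only the first few rows of a CSV source. The function names, the edge-triple format, and the choice of the first `k` rows are assumptions made for this example; the paper's actual serialization format and sampling strategy are not specified here.

```python
import csv
import io
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def serialize_graph(edges: Iterable[Tuple[str, str, str]]) -> str:
    """Serialize (subject, relation, object) edges as nested JSON text."""
    adjacency: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    for subject, relation, obj in edges:
        adjacency[subject].setdefault(relation, []).append(obj)
    # Sort keys and targets so the same graph always yields the same prompt text.
    ordered = {s: {r: sorted(objs) for r, objs in rels.items()}
               for s, rels in adjacency.items()}
    return json.dumps(ordered, sort_keys=True)


def sample_rows(csv_text: str, k: int = 3) -> List[dict]:
    """Keep only the first k data rows (assumed sampling rule) as dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for _, row in zip(range(k), reader)]
```

Deterministic ordering is a design choice rather than a requirement: it makes prompts reproducible and easier to compare across runs.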

Paper Structure (26 sections, 5 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: An overview of Knowledge Prompt Chaining Framework
  • Figure 2: Prompt Templates for prompting LLMs
  • Figure 3: Linear Relationship Between Complexity of Semantics and Performance Score