Ask Safely: Privacy-Aware LLM Query Generation for Knowledge Graphs
Mauro Dalle Lucca Tosi, Jordi Cabot
TL;DR
This work addresses privacy in knowledge graph question answering (KGQA): it uses the graph structure as context and masks sensitive values before a third-party LLM is asked to generate Cypher queries. The method maintains query quality while significantly reducing both data leakage and prompt size, and it integrates role-based access control (RBAC) into the query-generation process. Experiments on METAQA show robust performance with minimal accuracy loss, with potential for further improvement through synonym substitutions and manual curation. Overall, the approach demonstrates that privacy-preserving, token-efficient LLM prompting can enable scalable KG querying without sacrificing result quality.
Abstract
Large Language Models (LLMs) are increasingly used to query knowledge graphs (KGs) due to their strong semantic understanding and extrapolation capabilities compared to traditional approaches. However, these methods cannot be applied when the KG contains sensitive data and the user lacks the resources to deploy a local generative LLM. To address this issue, we propose a privacy-aware query generation approach for KGs. Our method identifies sensitive information in the graph based on its structure and omits such values before requesting the LLM to translate natural language questions into Cypher queries. Experimental results show that our approach preserves the quality of the generated queries while preventing sensitive data from being transmitted to third-party services.
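As a rough illustration of the masking idea described in the abstract, the following Python sketch replaces sensitive values in a question with Cypher parameter placeholders, then builds a prompt that contains only the graph schema and the masked question. It is not the authors' implementation. The function names (`mask_question`, `build_prompt`), the `$v0`-style placeholders, the example schema string, and the assumption that sensitive values are already available as a list are all hypothetical choices made for this example.

```python
import re


def mask_question(question: str, sensitive_values: list[str]) -> tuple[str, dict[str, str]]:
    """Replace sensitive values with $vN placeholders.

    Returns the masked question and a parameter map that stays local.
    """
    params: dict[str, str] = {}
    masked = question
    # Longest values first, so a value that contains another is masked whole.
    for value in sorted(sensitive_values, key=len, reverse=True):
        pattern = re.compile(re.escape(value), re.IGNORECASE)
        if pattern.search(masked):
            name = f"v{len(params)}"
            # A function replacement avoids any escape handling in the template.
            masked = pattern.sub(lambda _m, n=name: f"${n}", masked)
            params[name] = value
    return masked, params


def build_prompt(schema: str, masked_question: str) -> str:
    """Build the text sent to the third-party LLM: schema plus masked question only."""
    return (
        "Translate the question into a Cypher query.\n"
        "Keep every $-prefixed token unchanged as a query parameter.\n"
        f"Graph schema: {schema}\n"
        f"Question: {masked_question}\n"
    )


if __name__ == "__main__":
    # Hypothetical schema and sensitive values, for illustration only.
    schema = "(:Person {name})-[:ACTED_IN]->(:Movie {title})"
    sensitive = ["The Matrix", "Keanu Reeves"]

    masked, params = mask_question("Who acted in The Matrix?", sensitive)
    print(build_prompt(schema, masked))
    print(params)  # kept locally and supplied when the query is executed
```

Under these assumptions, the LLM would be expected to return a query that still contains `$v0`, which can then be executed locally with `params` as the query parameters, so the original value never appears in the outgoing prompt.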
