LP-LM: No Hallucinations in Question Answering with Logic Programming
Katherine Wu, Yanhong A. Liu
TL;DR
LP-LM addresses hallucination in question answering by grounding answers in a knowledge base (KB) through a Prolog-based semantic parsing pipeline. It uses a probabilistic context-free grammar to produce the most probable constituency parse of an input question, converts the parse into a Prolog term, and unifies that term against a KB of sentences that are also encoded as Prolog terms. Definite clause grammar (DCG) parsing with tabling lets it run in time linear in the size of the input sentence for sufficiently many grammar rules. In accuracy experiments, LP-LM gives reliable answers grounded in the KB, whereas current well-known LLMs hallucinate on even simple questions.
Abstract
Large language models (LLMs) can generate human-like responses to user queries. However, LLMs have inherent limitations, most notably that they hallucinate. This paper introduces LP-LM, a system that grounds answers to questions in known facts contained in a knowledge base (KB) through semantic parsing in Prolog, and that always produces reliable answers. Using Prolog definite clause grammar (DCG) parsing, LP-LM generates the most probable constituency parse tree for an input question, along with a corresponding Prolog term. The term is then executed against a KB of natural language sentences, also represented as Prolog terms, to answer the question. By leveraging DCG and tabling, LP-LM runs in time linear in the size of the input sentence for sufficiently many grammar rules. In experiments comparing the accuracy of LP-LM with that of current well-known LLMs, we show that LLMs hallucinate on even simple questions, whereas LP-LM does not.
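
To make the grounding step concrete, the following is a minimal Python sketch of answering a question by unifying a term against a KB of facts. It is an illustration only, not LP-LM's implementation, which is written in Prolog. The term shapes (`sentence`, `np`, `vp`), the example facts, and the helper names `Var`, `walk`, `unify`, and `answer` are all hypothetical. The sketch assumes KB facts contain no variables and omits the occurs check.

```python
class Var:
    """A logic variable; two variables are equal only if they are the same object."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"?{self.name}"


def walk(term, bindings):
    """Follow variable bindings until reaching a non-variable or an unbound variable."""
    while isinstance(term, Var) and term in bindings:
        term = bindings[term]
    return term


def unify(a, b, bindings):
    """Return extended bindings if a and b unify, else None (no occurs check)."""
    a, b = walk(a, bindings), walk(b, bindings)
    if isinstance(a, Var):
        return bindings if a is b else {**bindings, a: b}
    if isinstance(b, Var):
        return {**bindings, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple):
        if len(a) != len(b):
            return None
        for x, y in zip(a, b):
            bindings = unify(x, y, bindings)
            if bindings is None:
                return None
        return bindings
    # Atoms (plain strings) unify only when identical.
    return bindings if a == b else None


# Hypothetical KB: each sentence is stored as a ground term.
# Compound terms are tuples of the form (functor, arg1, arg2, ...).
KB = [
    ("sentence", ("np", "alice"), ("vp", "wrote", ("np", "the_report"))),
    ("sentence", ("np", "bob"), ("vp", "read", ("np", "the_report"))),
]


def answer(question, var):
    """Collect every binding of var for which the question term matches a KB fact."""
    results = []
    for fact in KB:
        bindings = unify(question, fact, {})
        if bindings is not None:
            results.append(walk(var, bindings))
    return results


who = Var("Who")

# "Who wrote the report?" as a term with one unbound variable.
wrote = ("sentence", ("np", who), ("vp", "wrote", ("np", "the_report")))
print(answer(wrote, who))   # ['alice']

# "Who signed the report?" matches no fact, so no answer is produced.
signed = ("sentence", ("np", who), ("vp", "signed", ("np", "the_report")))
print(answer(signed, who))  # []
```

In this sketch, an answer can only come from a binding produced by matching a stored fact. A question with no matching fact yields an empty result rather than a guess, which is the behavior the abstract contrasts with LLM hallucination.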
