$π$-CoT: Prolog-Initialized Chain-of-Thought Prompting for Multi-Hop Question-Answering
Chao Wan, Albert Gong, Mihir Mishra, Carl-Leander Henneking, Claas Beger, Kilian Q. Weinberger
TL;DR
Standard Chain-of-Thought prompting is prone to reasoning drift on multi-hop QA in retrieval-augmented setups. $π$-CoT is a training-free, prompt-based pipeline that translates complex questions into Prolog queries, resolves them via a SLICE module, and uses the intermediate Prolog artifacts to initialize the final CoT step, combining symbolic planning with neural retrieval. Across HotpotQA, 2WikiMultiHopQA, MuSiQue, and PhantomWiki, $π$-CoT matches or exceeds standard RAG and in-context CoT, with notable gains on harder, multi-branch questions and robustness to long contexts. By preserving a structured reasoning trace and enabling stepwise context management, the approach improves reliability and interpretability, suggesting a promising direction for symbolic-neural hybrids in open-domain QA.
Abstract
Chain-of-Thought (CoT) prompting significantly enhances large language models' (LLMs) problem-solving capabilities, but still struggles with complex multi-hop questions, often falling into circular reasoning patterns or deviating from the logical path entirely. This limitation is particularly acute in retrieval-augmented generation (RAG) settings, where obtaining the right context is critical. We introduce Prolog-Initialized Chain-of-Thought ($π$-CoT), a novel prompting strategy that combines logic programming's structural rigor with language models' flexibility. $π$-CoT reformulates multi-hop questions into Prolog queries decomposed as single-hop sub-queries. These are resolved sequentially, producing intermediate artifacts, with which we initialize the subsequent CoT reasoning procedure. Extensive experiments demonstrate that $π$-CoT significantly outperforms standard RAG and in-context CoT on multi-hop question-answering benchmarks.
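To make the decompose-then-resolve idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a two-hop question has already been written as a Prolog-style conjunction, `director("Inception", X), birthplace(X, Y)`, and that each single-hop sub-query can be answered by a lookup. The fact table `FACTS` and the helpers `resolve_single_hop`, `resolve_query`, and `build_cot_prompt` are hypothetical names introduced for illustration; in a real system the lookup would be replaced by retrieval plus an LLM call.

```python
# Toy illustration only: a hand-written fact table stands in for retrieval.
FACTS = {
    ("director", "Inception"): "Christopher Nolan",
    ("birthplace", "Christopher Nolan"): "London",
}


def resolve_single_hop(relation, subject):
    """Hypothetical single-hop resolver; returns None when no fact is known."""
    return FACTS.get((relation, subject))


def resolve_query(sub_queries):
    """Resolve sub-queries in order, carrying variable bindings forward.

    Each sub-query is (relation, subject, output_variable). A subject that
    names an already-bound variable (e.g. "X") is replaced by its value.
    """
    bindings = {}
    trace = []
    for relation, subject, out_var in sub_queries:
        subject_value = bindings.get(subject, subject)
        answer = resolve_single_hop(relation, subject_value)
        if answer is None:
            # Stop at the first unresolved hop; later hops depend on it.
            trace.append(f"{relation}({subject_value}, {out_var}): no answer found")
            break
        bindings[out_var] = answer
        trace.append(f"{relation}({subject_value}, {out_var}): {out_var} = {answer}")
    return bindings, trace


def build_cot_prompt(question, trace):
    """Prepend the intermediate results to a chain-of-thought prompt."""
    notes = "\n".join(f"- {step}" for step in trace)
    return (
        f"Question: {question}\n"
        f"Intermediate results:\n{notes}\n"
        "Let's think step by step."
    )


if __name__ == "__main__":
    query = [("director", "Inception", "X"), ("birthplace", "X", "Y")]
    bindings, trace = resolve_query(query)
    print(build_cot_prompt("Where was the director of Inception born?", trace))
```

The point of the sketch is the data flow: the binding produced by the first hop becomes the input to the second, and the accumulated trace, rather than only the final answer, is what seeds the subsequent CoT step.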
