
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments

Sitao Cheng, Ziyuan Zhuang, Yong Xu, Fangkai Yang, Chaoyun Zhang, Xiaoting Qin, Xiang Huang, Ling Chen, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

TL;DR

Readi presents a planning-and-editing framework for reasoning over large-scale structured environments, addressing efficiency and faithfulness gaps in prior LLM-based methods. An LLM first generates an initial reasoning path, which is instantiated on knowledge graphs or tables; if instantiation fails, targeted editing is guided by a structured reasoning log, reducing unnecessary LLM calls. Across KGQA and TableQA benchmarks, Readi outperforms most inference-based LLM baselines and many training-based methods, achieving state-of-the-art results on several datasets while keeping LLM usage low (about 1.55 edits per case on average). The approach demonstrates how explicit reasoning paths combined with environment-grounded feedback can ground natural language in structured environments accurately and efficiently, with practical impact for complex question answering over real-world structured data.

Abstract

Large Language Models (LLMs) have shown potential in reasoning over structured environments, e.g., knowledge graphs and tables. Such tasks typically require multi-hop reasoning, i.e., matching natural language utterances with instances in the environment. Previous methods leverage LLMs to incrementally build a reasoning path, where the LLMs either invoke tools or pick up schemas by interacting with the environment step by step. We propose Reasoning-Path-Editing (Readi), a novel framework where LLMs can efficiently and faithfully reason over structured environments. In Readi, LLMs initially generate a reasoning path given a query, and edit the path only when necessary. We instantiate the path on structured environments and provide feedback to edit the path if anything goes wrong. Experimental results on three KGQA and two TableQA datasets show the effectiveness of Readi, significantly surpassing previous LLM-based methods (by 9.1% Hit@1 on WebQSP, 12.4% on MQA-3H and 9.5% on WTQ), comparable with state-of-the-art fine-tuned methods (67% on CWQ and 74.7% on WebQSP) and substantially boosting the vanilla LLMs (by 14.9% on CWQ). Our code will be available at https://aka.ms/readi.
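The plan-then-edit loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge graph, the `instantiate` and `reason` helpers, the log fields, and the stub planner and editor (which stand in for LLM calls) are all hypothetical names chosen for illustration. The sketch only shows the control flow: plan once, instantiate the path on the environment, and call the editor only when instantiation gets stuck.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy knowledge graph: (head entity, relation) -> tail entities.
KG: Dict[Tuple[str, str], List[str]] = {
    ("France", "capital"): ["Paris"],
    ("Paris", "river"): ["Seine"],
}

Path = List[str]  # a reasoning path: relations to follow from the topic entity


def instantiate(kg, start: str, path: Path) -> Tuple[List[str], Optional[dict]]:
    """Follow `path` from `start`.

    Returns (entities, None) on success, or ([], log) where `log` records
    where instantiation got stuck and which relations exist at that point.
    """
    frontier = [start]
    for step, relation in enumerate(path):
        reached = [t for e in frontier for t in kg.get((e, relation), [])]
        if not reached:
            candidates = sorted({r for (h, r) in kg if h in frontier})
            return [], {
                "step": step,
                "stuck_at": frontier,
                "failed_relation": relation,
                "candidates": candidates,
            }
        frontier = reached
    return frontier, None


def reason(kg, start: str, plan: Callable[[], Path],
           edit: Callable[[Path, dict], Path], max_edits: int = 3):
    """Plan once, then edit the path only when instantiation fails."""
    path = plan()
    edits = 0
    while True:
        entities, log = instantiate(kg, start, path)
        if log is None or edits >= max_edits:
            return entities, path, edits
        path = edit(path, log)  # editor is called only on failure
        edits += 1


# Stubs standing in for LLM calls (assumptions for this sketch).
def stub_plan() -> Path:
    return ["capital_city", "river"]  # first relation does not exist in KG


def stub_edit(path: Path, log: dict) -> Path:
    fixed = list(path)
    if log["candidates"]:
        fixed[log["step"]] = log["candidates"][0]
    return fixed


answer, final_path, num_edits = reason(KG, "France", stub_plan, stub_edit)
print(answer, final_path, num_edits)
```

In this toy run the initial path fails at its first relation, the log exposes the relations actually available from "France", and a single edit repairs the path. A path that instantiates on the first try would require no editor call at all, which is the efficiency argument the abstract makes.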

Paper Structure

This paper contains 40 sections, 8 figures, 17 tables, 1 algorithm.

Figures (8)

  • Figure 1: An illustration of our proposed framework, Readi, where LLMs initially generate a reasoning path, and when necessary, edit this path. We instantiate the path on structured environments and invoke editing if the instantiation gets stuck.
  • Figure 2: Examples of the question, reasoning path, and corresponding path instances on a knowledge graph.
  • Figure 3: A running example of Readi on KGQA. An LLM initially generates a reasoning path for a question. Then, we instantiate it on the KG. If anything goes wrong (the path from "France"), we collect some error messages and call an LLM to edit the path. Finally, an LLM answers the question based on the KG instances.
  • Figure 4: Extensive features of Readi's reasoning path, compared with fine-tuned methods and Golden.
  • Figure 5: Distribution of the number of LLM calls for reasoning path editing of Readi-GPT4.
  • ...and 3 more figures