KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search

Haoran Luo, Haihong E, Yikai Guo, Qika Lin, Xiaobao Wu, Xinyu Mu, Wenhao Liu, Meina Song, Yifan Zhu, Luu Anh Tuan

TL;DR

KBQA-o1 tackles knowledge-base question answering under limited annotated data through an agentic framework that combines a ReAct-style prompt with Monte Carlo Tree Search to explore the KB environment and generate executable logical forms. It uses a policy model and a reward model to guide the search, and applies incremental fine-tuning on auto-annotated data to improve performance with minimal supervision. Across GrailQA, WebQSP, and GraphQ, KBQA-o1 with open-source LLMs achieves substantial gains over prior low-resource methods and rivals fully supervised systems, especially on compositional and zero-shot tasks. The approach is plug-and-play across multiple LLMs and scalable, making it practical for diverse KBQA deployments.

Abstract

Knowledge Base Question Answering (KBQA) aims to answer natural language questions with a large-scale structured knowledge base (KB). Despite advancements with large language models (LLMs), KBQA still faces challenges in weak KB awareness, imbalance between effectiveness and efficiency, and high reliance on annotated data. To address these challenges, we propose KBQA-o1, a novel agentic KBQA method with Monte Carlo Tree Search (MCTS). It introduces a ReAct-based agent process for stepwise logical form generation with KB environment exploration. Moreover, it employs MCTS, a heuristic search method driven by policy and reward models, to balance the performance of agentic exploration against the size of the search space. With heuristic exploration, KBQA-o1 generates high-quality annotations for further improvement by incremental fine-tuning. Experimental results show that KBQA-o1 outperforms previous low-resource KBQA methods with limited annotated data, raising the GrailQA F1 of the Llama-3.1-8B model to 78.5%, compared with 48.5% for the previous state-of-the-art method with GPT-3.5-turbo. Our code is publicly available.
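To make the search idea concrete, here is a minimal sketch of MCTS in which one function proposes next actions and another scores a path. It is an illustration under stated assumptions, not the authors' implementation: the names Node, uct, mcts, toy_propose, and toy_reward are hypothetical, the two toy functions stand in for the policy and reward models, a two-letter target stands in for a logical form, and the scoring rule is the textbook UCT formula, which may differ from the one used in the paper.

```python
import math
from typing import Callable, List, Optional


class Node:
    """A search state reached by appending `action` to the parent's path."""

    def __init__(self, action: Optional[str] = None, parent: Optional["Node"] = None):
        self.action = action
        self.parent = parent
        self.children: List["Node"] = []
        self.visits = 0
        self.value_sum = 0.0

    def path(self) -> List[str]:
        node, actions = self, []
        while node.parent is not None:
            actions.append(node.action)
            node = node.parent
        return actions[::-1]


def uct(child: Node, weight: float) -> float:
    """Mean reward plus an exploration bonus; unvisited children are tried first."""
    if child.visits == 0:
        return math.inf
    mean = child.value_sum / child.visits
    return mean + weight * math.sqrt(math.log(child.parent.visits) / child.visits)


def mcts(propose: Callable[[List[str]], List[str]],
         reward: Callable[[List[str]], float],
         rollouts: int = 50, weight: float = 1.0, max_depth: int = 2) -> List[str]:
    root = Node()
    for _ in range(rollouts):
        node = root
        # Selection: descend by UCT score until a leaf is reached.
        while node.children:
            node = max(node.children, key=lambda c: uct(c, weight))
        # Expansion: ask the stand-in policy for candidate next actions.
        if len(node.path()) < max_depth:
            node.children = [Node(a, node) for a in propose(node.path())]
            if node.children:
                node = node.children[0]
        # Evaluation: score the current path with the stand-in reward function.
        value = reward(node.path())
        # Backpropagation: update visit counts and values up to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    # Read out the path that follows the most-visited child at each level.
    best, node = [], root
    while node.children:
        node = max(node.children, key=lambda c: c.visits)
        best.append(node.action)
    return best


# Toy stand-ins (assumptions for illustration only).
TARGET = ["a", "b"]


def toy_propose(path: List[str]) -> List[str]:
    return ["a", "b"]  # candidates in the order the "policy" prefers them


def toy_reward(path: List[str]) -> float:
    return sum(p == t for p, t in zip(path, TARGET)) / len(TARGET)


if __name__ == "__main__":
    print(mcts(toy_propose, toy_reward))
```

The weight argument controls how strongly the search favors rarely visited branches over branches with a high mean reward, which is the trade-off between search space and performance that the abstract refers to.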

Paper Structure

This paper contains 34 sections, 3 theorems, 43 equations, 9 figures, 7 tables, and 1 algorithm.

Key Result

Proposition 4.1

The agent's awareness of the environment makes it more effective in generating optimal logical forms compared to end-to-end methods.

Figures (9)

  • Figure 1: An example of a KBQA task: answering a natural language question by exploring the KB environment with a KBQA agent.
  • Figure 2: Comparison of previous end-to-end KBQA frameworks (retrieve-then-generate (RG) and generate-then-retrieve (GR) methods), step-by-step KBQA methods (CoT-based and ToT-based), and our proposed heuristic KBQA method, which is agentic and MCTS-based. With the same Llama-3.1-8B base model, both the MCTS-based agent process and the full KBQA-o1 after incremental fine-tuning show improvements on all three KBQA datasets.
  • Figure 3: An example of the heuristic KB environment exploration with MCTS driven by policy and reward models.
  • Figure 4: Performance and efficiency of Llama-3.1-8B-based KBQA-o1 against baseline methods. (a) F1 score comparison across datasets. (b) F1 scores across logical operators on GrailQA. (c) Trade-off between F1 score and queries per minute on GrailQA.
  • Figure 5: Impact of incremental fine-tuning tested on GrailQA: (a) Effect of exploration samples on F1 and EM scores. (b) Relationship between reward threshold, data ratio, and performance. (c) Influence of MCTS exploration weight on query efficiency and accuracy.
  • ...and 4 more figures
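
The reward threshold and data ratio named in the Figure 5(b) caption can be illustrated with a small sketch: explored samples whose reward meets a threshold are kept as auto-annotations for fine-tuning, and the data ratio is the fraction kept. This is an assumption about how the two quantities relate, not the authors' code; Explored, filter_by_reward, and the sample values below are hypothetical placeholders.

```python
from typing import List, NamedTuple, Tuple


class Explored(NamedTuple):
    """One explored sample: a question, a candidate logical form, and its reward."""
    question: str
    logical_form: str
    reward: float


def filter_by_reward(samples: List[Explored], threshold: float) -> Tuple[List[Explored], float]:
    """Keep samples with reward >= threshold; also return the fraction kept."""
    kept = [s for s in samples if s.reward >= threshold]
    ratio = len(kept) / len(samples) if samples else 0.0
    return kept, ratio


# Made-up placeholder samples, not data from the paper.
samples = [
    Explored("q1", "lf1", 0.9),
    Explored("q2", "lf2", 0.4),
    Explored("q3", "lf3", 0.7),
]
kept, ratio = filter_by_reward(samples, threshold=0.6)
```

A higher threshold keeps fewer but more reliable annotations, so the threshold trades annotation quality against the amount of fine-tuning data.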

Theorems & Definitions (9)

  • Proposition 4.1
  • proof
  • Proposition 4.2
  • proof
  • Proposition 4.3
  • proof
  • proof
  • proof
  • proof