RnG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering

Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong

TL;DR

KBQA systems struggle to generalize to unseen KB schemas and compositions. RnG-KBQA addresses this by coupling a contrastive, BERT-based ranker over candidate logical forms with a T5-based generator conditioned on the question and top-ranked candidates, plus an execution-augmented inference step. The approach achieves state-of-the-art results on GrailQA (EM 68.8, F1 74.4) and WebQSP (F1 75.6), with strong zero-shot and compositional generalization, and ablations confirm the additive value of both components. This rank-and-generate framework demonstrates the practical benefits of integrating ranking with generation for robust KBQA and extends to improved entity disambiguation in the pipeline.
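One plausible reading of the execution-augmented inference step is sketched below in Python. It assumes that generated logical forms are tried in beam order, that the first one executing to a non-empty answer set is returned, and that the ranker's top candidates serve as the fallback. The `execute` callable, the toy KB, the logical-form syntax, and the fallback order are hypothetical illustrations, not details confirmed by the summary above.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

# Hypothetical executor type: maps a logical form to its answer set,
# or None when the form cannot be executed against the KB.
Executor = Callable[[str], Optional[Set[str]]]


def execution_augmented_inference(
    generated: Iterable[str],
    ranked: Iterable[str],
    execute: Executor,
) -> Tuple[Optional[str], Set[str]]:
    """Return the first logical form that executes to a non-empty answer set.

    Generator outputs are tried first (in beam order); the ranker's
    candidates are used as a fallback. This ordering is an assumption.
    """
    for logical_form in list(generated) + list(ranked):
        answers = execute(logical_form)
        if answers:  # skips both None (not executable) and empty results
            return logical_form, answers
    return None, set()


# Toy KB with a single executable logical form (illustrative only).
TOY_KB = {"(JOIN capital_of France)": {"Paris"}}


def toy_execute(logical_form: str) -> Optional[Set[str]]:
    return TOY_KB.get(logical_form)


form, answers = execution_augmented_inference(
    generated=["(JOIN capitol_of France)"],  # misspelled relation, not executable
    ranked=["(JOIN capital_of France)"],
    execute=toy_execute,
)
```

In this toy run the generated form fails to execute, so the ranked candidate is returned instead.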

Abstract

Existing KBQA approaches, despite achieving strong performance on i.i.d. test data, often struggle to generalize to questions involving unseen KB schema items. Prior ranking-based approaches have shown some success in generalization but suffer from a coverage issue: the correct logical form may not be among the candidates enumerated by the search rules. We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving strong generalization capability. Our approach first uses a contrastive ranker to rank a set of candidate logical forms obtained by searching over the knowledge graph. It then introduces a tailored generation model, conditioned on the question and the top-ranked candidates, to compose the final logical form. We achieve new state-of-the-art results on the GrailQA and WebQSP datasets. In particular, our method surpasses the prior state of the art by a large margin on the GrailQA leaderboard. In addition, RnG-KBQA outperforms all prior approaches on the popular WebQSP benchmark, including those that use oracle entity linking. The experimental results demonstrate the effectiveness of the interplay between ranking and generation, which leads to the superior performance of our approach across all settings, with especially strong improvements in zero-shot generalization.
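The contrastive ranking step described in the abstract can be illustrated with a minimal sketch. It assumes the objective is a softmax cross-entropy that pushes the gold logical form's score above the scores of negative candidates, which is one common form of contrastive loss and is not stated in the text above. The function names are hypothetical, and the scores are plain floats standing in for the outputs of a BERT-based scorer.

```python
import math
from typing import List, Sequence


def contrastive_ranking_loss(gold_score: float, negative_scores: Sequence[float]) -> float:
    """Negative log-softmax of the gold score against the negative scores.

    Assumed form: -log( exp(s_gold) / (exp(s_gold) + sum_i exp(s_neg_i)) ).
    """
    scores = [gold_score, *negative_scores]
    peak = max(scores)  # subtract the max for numerical stability
    log_partition = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_partition - gold_score


def top_k(candidates: Sequence[str], scores: Sequence[float], k: int) -> List[str]:
    """Return the k highest-scoring candidates, best first."""
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:k]]
```

The loss shrinks as the gold candidate's score rises relative to the negatives, and `top_k` selects the candidates that would be passed on to the generator.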

Paper Structure

This paper contains 26 sections, 2 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Overview of our rank-and-generate approach. Given a question, we first rank logical form candidates obtained by searching over the KB based on predefined rules. Here, the ground truth logical form is not in the top-ranked candidates as it is not covered by the rules. We solve this problem using another generation step that produces the correct logical form based on top-ranked candidates. The final logical form is executed over the KB to yield the answer.
  • Figure 2: The ranker that learns from the contrast between the ground truth and negative candidates.
  • Figure 3: The generation model conditioned on question and top-ranked candidates returned by the ranker.
  • Figure 4: Illustrative example of running entity disambiguation as ranking. A confusing entity (red) and the correct entity (green) both match the surface form in the question. To distinguish them, we train an entity disambiguation model following the same architecture as in logical form ranking but construct inputs by concatenating the question and relations.
  • Figure 5: Examples of compositional generalization to new composition of KB schema items and zero-shot generalization to unseen schema items (red).
  • ...and 2 more figures
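The input construction described in the captions of Figures 3 and 4 can be sketched as simple string concatenation. The separators, the default `k`, and the function names below are hypothetical choices for illustration; the captions state only that the generator is conditioned on the question and top-ranked candidates, and that the disambiguation model's inputs concatenate the question and relations.

```python
from typing import Sequence


def build_generator_input(
    question: str,
    ranked_candidates: Sequence[str],
    k: int = 5,          # assumed number of candidates passed to the generator
    sep: str = " ; ",    # assumed separator
) -> str:
    """Concatenate the question with the top-k ranked logical forms."""
    return sep.join([question, *ranked_candidates[:k]])


def build_disambiguation_input(
    question: str,
    relations: Sequence[str],
    sep: str = " | ",    # assumed separator between question and relations
) -> str:
    """Concatenate the question with the relations linked to one entity candidate."""
    return f"{question}{sep}{' '.join(relations)}"
```

Each entity candidate would get its own disambiguation input, so the same ranking setup can score candidates against one another.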