PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs
Ying Su, Jipeng Zhang, Yangqiu Song, Tong Zhang
TL;DR
PipeNet tackles question answering (QA) over knowledge graphs (KGs) with a grounding-pruning-reasoning pipeline that uses dependency-parsing signals to prune noisy external nodes before graph reasoning. A DP-pruning module scores nodes by the dependency distance between matched spans and shrinks the subgraph, while a simplified graph attention network (GAT) fuses LM context with the pruned subgraph to predict answers. The approach yields strong accuracy on CommonsenseQA and OpenBookQA, with substantial memory and time savings at high pruning rates, indicating that semantically informed pruning improves both efficiency and subgraph quality. The results suggest that explicit knowledge graphs continue to provide value complementary to implicit LM knowledge in QA tasks. The code is made available for reproducibility and further development.
Abstract
It is well acknowledged that incorporating explicit knowledge graphs (KGs) can benefit question answering. Existing approaches typically follow a grounding-reasoning pipeline in which entity nodes are first grounded for the query (question and candidate answers), and then a reasoning module reasons over the matched multi-hop subgraph for answer prediction. Although the pipeline largely alleviates the issue of extracting essential information from giant KGs, efficiency is still an open challenge when scaling up the number of hops in grounding the subgraphs. In this paper, we aim to find semantically related entity nodes in the subgraph to improve the efficiency of graph reasoning with KGs. We propose a grounding-pruning-reasoning pipeline to prune noisy nodes, remarkably reducing the computation cost and memory usage while also obtaining a decent subgraph representation. In detail, the pruning module first scores concept nodes based on the dependency distance between matched spans and then prunes the nodes according to their score ranks. To facilitate the evaluation of pruned subgraphs, we also propose a graph attention network (GAT)-based module to reason over the subgraph data. Experimental results on CommonsenseQA and OpenBookQA demonstrate the effectiveness of our method.
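
As an illustration of the score-then-prune step described above, the following is a minimal sketch rather than the paper's implementation. The abstract does not give the exact scoring formula, so several choices here are assumptions: the dependency parse is supplied as a hand-written head-index list, an external node is scored as 1 / (1 + d) where d is the smallest dependency distance between any two matched spans it links, and nodes linked to fewer than two matched concepts are ranked last. All function names, the example sentence, and the example nodes are hypothetical.

```python
from collections import deque


def tree_distances(heads):
    """All-pairs path lengths in a dependency tree given as head indices (-1 = root)."""
    n = len(heads)
    adj = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].append(head)
            adj[head].append(child)
    dist = []
    for src in range(n):
        d = [None] * n
        d[src] = 0
        queue = deque([src])
        while queue:  # breadth-first search over the undirected tree
            u = queue.popleft()
            for v in adj[u]:
                if d[v] is None:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist.append(d)
    return dist


def span_distance(dist, span_a, span_b):
    """Smallest token-to-token tree distance between two spans of token indices."""
    return min(dist[i][j] for i in span_a for j in span_b)


def score_external_nodes(dist, spans, neighbors):
    """Assumed scoring: closer linked spans give a higher score.

    spans: matched concept -> token indices of its span
    neighbors: external node -> matched concepts it is connected to in the KG
    """
    scores = {}
    for node, concepts in neighbors.items():
        pairs = [(a, b) for i, a in enumerate(concepts) for b in concepts[i + 1:]]
        if not pairs:
            scores[node] = float("-inf")  # assumption: unlinked nodes rank last
            continue
        best = min(span_distance(dist, spans[a], spans[b]) for a, b in pairs)
        scores[node] = 1.0 / (1 + best)
    return scores


def prune(scores, prune_rate):
    """Keep the top-ranked nodes, dropping a prune_rate fraction of them."""
    keep = len(scores) - int(len(scores) * prune_rate)
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return ranked[:keep]


# Hand-written parse of "the bird sat on the branch of a tree" (hypothetical example).
# Token indices: the=0 bird=1 sat=2 on=3 the=4 branch=5 of=6 a=7 tree=8
heads = [1, 2, -1, 5, 5, 2, 8, 8, 5]
spans = {"bird": [1], "branch": [5], "tree": [8]}
neighbors = {
    "nest": ["bird", "tree"],
    "wood": ["branch", "tree"],
    "perch": ["bird", "branch"],
    "sky": ["bird"],
}

dist = tree_distances(heads)
scores = score_external_nodes(dist, spans, neighbors)
kept = prune(scores, prune_rate=0.5)
print(kept)  # ['wood', 'perch']
```

In this toy example, "wood" links the two spans that are closest in the dependency tree (distance 1), so it ranks first, while "sky" links only one matched concept and is pruned first. In practice the head indices would come from a dependency parser rather than being written by hand.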
