HPE: Answering Complex Questions over Text by Hybrid Question Parsing and Execution

Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou

TL;DR

HPE introduces a neural-symbolic framework that parses complex textual questions into explicit H-expressions and executes them via a hybrid executor that combines a neural reader for single-hop questions with deterministic symbolic composition. By decoupling parsing from execution, the approach improves interpretability and generalization, delivering strong results on multi-hop datasets in supervised, few-shot, and zero-shot settings. The system employs a T5-based H-parser and a plug-in FiD reader, pre-trained on large QA corpora and fine-tuned with MuSiQue and 2WikiMultiHopQA, with zero-shot tests on HotpotQA and Natural Questions demonstrating robustness. Overall, HPE achieves state-of-the-art or competitive performance while exposing the underlying reasoning process, offering practical benefits for debugging and extending QA to KB/Table contexts.

Abstract

The dominant paradigm of textual question answering systems is based on end-to-end neural networks, which excel at answering natural language questions but fall short on complex ones. This stands in contrast to the broad adoption of semantic parsing approaches over structured data sources (e.g., relational databases, knowledge graphs), which convert natural language questions to logical forms and execute them with query engines. To combine the strengths of neural and symbolic methods, we propose a framework of question parsing and execution for textual QA. It comprises two central pillars: (1) we parse questions of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) to execute the resulting H-expressions, we design a hybrid executor, which integrates deterministic rules that translate the symbolic operations with a drop-in neural reader network that answers each decomposed simple question. Hence, the proposed framework can be viewed as top-down question parsing followed by bottom-up answer backtracking. The resulting H-expressions closely guide the execution process, offering higher precision as well as better interpretability while still preserving the advantages of neural readers for resolving their primitive elements. Our extensive experiments on MuSiQue, 2WikiQA, HotpotQA, and NQ show that the proposed parsing and hybrid execution framework outperforms existing approaches in supervised, few-shot, and zero-shot settings, while also effectively exposing its underlying reasoning process.
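
The two pillars described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the `JOIN` operation name, the `#1` placeholder convention, and the dictionary-backed `toy_reader` are all assumptions made for illustration, and a real system would call a neural reader in place of the lookup table.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Question:
    """Primitive of the expression: one simple (single-hop) question."""
    text: str


@dataclass
class Op:
    """Symbolic operation over sub-expressions (hypothetical naming)."""
    name: str
    args: List["Node"]


Node = Union[Question, Op]
Reader = Callable[[str], str]  # stand-in for a drop-in neural reader


def execute(node: Node, reader: Reader) -> str:
    """Bottom-up execution: answer the leaves, then compose with fixed rules."""
    if isinstance(node, Question):
        return reader(node.text)
    if node.name == "JOIN":
        # Assumed rule: answer the first argument, then substitute that
        # answer for the "#1" placeholder in the second (a simple question).
        inner, outer = node.args
        if not isinstance(outer, Question):
            raise TypeError("this sketch expects a Question as the second JOIN argument")
        bridge = execute(inner, reader)
        return reader(outer.text.replace("#1", bridge))
    raise ValueError(f"unsupported operation: {node.name}")


def toy_reader(question: str) -> str:
    """Hypothetical reader: a lookup table instead of a neural model."""
    facts = {
        "Who directed Inception?": "Christopher Nolan",
        "Where was Christopher Nolan born?": "London",
    }
    return facts.get(question, "")


expr = Op("JOIN", [Question("Who directed Inception?"),
                   Question("Where was #1 born?")])
answer = execute(expr, toy_reader)
print(answer)  # London
```

Because the reader is passed in as a plain callable, it can be swapped without touching the composition rules, which is the sense in which the abstract calls the reader "drop-in".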

Paper Structure (43 sections, 5 figures, 10 tables)

This paper contains 43 sections, 5 figures, 10 tables.

Figures (5)

  • Figure 1: An illustration of H-expression.
  • Figure 2: An overview of parsing with the H-parser, which translates the input question into an H-expression and then reshapes it into a tree structure that determines the execution order of the nodes.
  • Figure 3: An overview of the H-executor. Each question node is executed by a plug-in neural reader, which returns an answer for that question; the deterministic symbolic interpreter then executes the expression to yield the final answer.
  • Figure 4: 3hop2-type MuSiQue question example and how our framework finds the final answer.
  • Figure 5: Answer F1 score on each reasoning type on MuSiQue and 2WikiQA.
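
The reshaping step mentioned in the caption of Figure 2 (expression to tree to execution order) can be sketched as follows. The surface syntax used here, `OP(arg, ...)` with leaf questions written as `Q[...]`, is a hypothetical notation chosen for this example and is not taken from the paper; the function names are likewise illustrative. The sketch also assumes that question text contains no `]` character.

```python
from typing import List, Tuple, Union

# A tree is either a leaf (the question text) or (operation name, children).
Tree = Union[str, Tuple[str, List["Tree"]]]


def _skip_ws(text: str, pos: int) -> int:
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def _parse_node(text: str, pos: int) -> Tuple[Tree, int]:
    """Parse one node starting at pos; return the node and the next position."""
    pos = _skip_ws(text, pos)
    if text.startswith("Q[", pos):
        end = text.index("]", pos)
        return text[pos + 2:end], end + 1
    open_paren = text.index("(", pos)
    name = text[pos:open_paren].strip()
    pos = open_paren + 1
    children: List[Tree] = []
    while True:
        child, pos = _parse_node(text, pos)
        children.append(child)
        pos = _skip_ws(text, pos)
        if pos >= len(text):
            raise ValueError("unbalanced parentheses")
        if text[pos] == ",":
            pos += 1
        elif text[pos] == ")":
            return (name, children), pos + 1
        else:
            raise ValueError(f"unexpected character at position {pos}")


def parse(text: str) -> Tree:
    tree, pos = _parse_node(text, 0)
    if text[pos:].strip():
        raise ValueError(f"unexpected trailing text at position {pos}")
    return tree


def execution_order(tree: Tree) -> List[str]:
    """Post-order traversal: children are resolved before their parent."""
    if isinstance(tree, str):
        return [tree]
    name, children = tree
    order: List[str] = []
    for child in children:
        order.extend(execution_order(child))
    order.append(name)
    return order


tree = parse("JOIN(JOIN(Q[a], Q[b]), Q[c])")
order = execution_order(tree)
print(order)  # ['a', 'b', 'JOIN', 'c', 'JOIN']
```

A post-order traversal is used because an operation can only be evaluated once all of its arguments have answers, which matches the bottom-up execution described for the H-executor.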