Gradually Excavating External Knowledge for Implicit Complex Question Answering

Chang Liu; Xiaoguang Li; Lifeng Shang; Xin Jiang; Qun Liu; Edmund Y. Lam; Ngai Wong

TL;DR

This work proposes a gradual knowledge excavation framework for open-domain complex question answering, where LLMs iteratively and actively acquire external information, and then reason based on acquired historical knowledge.

Abstract

Recently, large language models (LLMs) have gained much attention for their emergent human-comparable capabilities and huge potential. However, for open-domain implicit question answering, LLMs may not be the ultimate solution, for two reasons: 1) uncovered or out-of-date domain knowledge, and 2) one-shot generation, which restricts comprehensiveness. To this end, this work proposes a gradual knowledge excavation framework for open-domain complex question answering, where LLMs iteratively and actively acquire external information, and then reason based on acquired historical knowledge. Specifically, during each step of the solving process, the model selects an action to execute, such as querying external knowledge or performing a single logical reasoning step, to gradually progress toward a final answer. Our method can effectively leverage plug-and-play external knowledge and dynamically adjust the strategy for solving complex questions. Evaluated on the StrategyQA dataset, our method achieves 78.17% accuracy with less than 6% of the parameters of its competitors, setting a new SOTA for ~10B-scale LLMs.

Paper Structure (22 sections, 4 equations, 3 figures, 6 tables)

Introduction
Related Work
Problem definition
Gradually Excavating External Knowledge
Core model
Retriever
Extractor
GEEK Pipeline and Action Space
Strategy Exploration
Case Study
Experiment
Dataset and Preprocessing
Detailed Settings
Comparison with other Baselines
Ablation Study
...and 7 more sections

Figures (3)

Figure 1: LLMs fail to solve open-domain complex questions due to unrecognized entities and implicit strategies. (1) In the upper part, the LLM fails to answer the question with one-shot generation, because there is no off-the-shelf answer or evidence for this question. However, the question can be decomposed into several sub-questions and solved once the citizenship contradiction is identified. If the hint of 'citizenship contradiction' is also given, the LLM can successfully solve the question using its internal knowledge. (2) In the bottom case, which involves less well-known entities, the LLM fails again due to a lack of specialized knowledge about 'Aisin-Gioro Yizhu' and hence refuses to answer. Moreover, the 'political system' strategy is unlikely to be discovered from the question text alone, unless enough knowledge is provided. 'Citizenship contradiction' is also a possible solution.
Figure 2: GEEK workflow: the core model, retriever, and extractor collaborate to solve complex questions progressively. (Left): In each iteration, based on the question state $\mathcal{Q}_t$, GEEK selects an action and calls the corresponding module to execute it. The execution in turn updates the question state, until a final answer $\mathpzc{z}$ is derived. (Right): The detailed procedure of action selection and execution. For action selection, $\mathcal{Q}_t$ and $\mathcal{A}$ are fed into the core model with the instruction for action selection, and the model outputs an action code $\mathpzc{a}$. For the execution of Add Decomp, Final Answer, and Self Answer, the core model outputs the corresponding responses following different instructions. Finally, for Retrieve & Extract, the retriever first retrieves several paragraphs $\mathcal{P}$ from the corpus as external knowledge, and then the extractor answers the decomposition question $d_t$ based on $\mathcal{P}$.
Figure 3: Full process of GEEK inference. For each round, the prompts are shown in gray, and the current question state is also given to the model as input. Model responses are shown in green, and action selection is represented by a red circle to save space. In the top right corner, the question state is listed, where the historical sub-questions and facts are gradually added during inference (best viewed in color, following the numerical marks).
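
The iterative loop described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `QuestionState` class, the `solve` function, the action code strings, and the `max_steps` cap are all hypothetical names introduced here, and the core model, retriever, and extractor are passed in as plain callables so that any backend could be plugged in. It also assumes that Self Answer resolves the pending sub-question from the core model's own knowledge, which the caption does not spell out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical action codes standing in for the action space A.
ACTIONS = ("add_decomp", "retrieve_extract", "self_answer", "final_answer")


@dataclass
class QuestionState:
    """Assumed shape of the question state Q_t: the question plus its history."""
    question: str
    history: List[Tuple[str, str]] = field(default_factory=list)  # (sub-question, fact)
    pending: Optional[str] = None  # current decomposition d_t awaiting an answer


def solve(
    question: str,
    select_action: Callable[[QuestionState, Sequence[str]], str],  # core model
    add_decomp: Callable[[QuestionState], str],                    # core model
    retrieve: Callable[[str], List[str]],                          # retriever
    extract: Callable[[str, List[str]], str],                      # extractor
    self_answer: Callable[[QuestionState], str],                   # core model
    final_answer: Callable[[QuestionState], object],               # core model
    max_steps: int = 10,  # assumed safety cap, not taken from the paper
):
    """Run the select-execute-update loop; return (answer, final state)."""
    state = QuestionState(question)
    for _ in range(max_steps):
        action = select_action(state, ACTIONS)
        if action == "add_decomp":
            # Propose the next sub-question d_t.
            state.pending = add_decomp(state)
        elif action in ("retrieve_extract", "self_answer"):
            if state.pending is None:
                raise ValueError("no pending sub-question to answer")
            if action == "retrieve_extract":
                # Fetch paragraphs P, then answer d_t from them.
                paragraphs = retrieve(state.pending)
                fact = extract(state.pending, paragraphs)
            else:
                # Assumption: answer d_t without external knowledge.
                fact = self_answer(state)
            state.history.append((state.pending, fact))
            state.pending = None
        elif action == "final_answer":
            return final_answer(state), state
        else:
            raise ValueError(f"unknown action code: {action!r}")
    return None, state  # step budget exhausted without a final answer
```

Keeping the modules as injected callables mirrors the plug-and-play framing of the abstract: the retriever or corpus can be swapped without touching the loop, and the growing `history` plays the role of the acquired historical knowledge that later steps reason over.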
