FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection

Yufei Huang, Xu Han, Maosong Sun

TL;DR

FastFiD tackles the inference bottleneck of FiD-based open-domain QA by selecting sentences after encoding, which compresses the context the decoder must attend to. It uses a two-stage training regime: the first stage jointly learns sentence selection and answer generation, and the second trains the model to generate answers from only the selected sentences. Results on Natural Questions, TriviaQA, and ASQA show inference speedups of 2.3X to 5.7X with negligible loss in QA performance, and an attention analysis confirms that the selected sentences are highly informative for answer construction. The approach also extends to decoder-only LLMs, making it relevant to efficient knowledge-intensive QA systems more broadly.

Abstract

Open Domain Question Answering (ODQA) has advanced rapidly in recent years, driven by significant developments in dense passage retrieval and pretrained language models. Current models typically adopt the FiD framework, which is composed of a neural retriever and an encoder-decoder neural reader. During answer generation, the retriever retrieves a large number of passages (around 100, for instance), each of which is encoded individually by the encoder. The decoder then makes predictions based on these encoded passages. However, this framework can be time-consuming, particularly because of the total length of the gathered passages. To address this, we introduce FastFiD, a novel approach that performs sentence selection on the encoded passages. This retains valuable sentences while reducing the context length required for generating answers. Experiments on three commonly used datasets (Natural Questions, TriviaQA, and ASQA) demonstrate that our method speeds up inference by 2.3X-5.7X while maintaining the model's performance. Moreover, an in-depth analysis of the model's attention reveals that the selected sentences contribute substantially to the final answer. The code is publicly available at https://github.com/thunlp/FastFiD.
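
The core idea of shrinking the decoder's context by keeping only selected sentences from the encoded passages can be illustrated with a minimal sketch. This is not the authors' implementation: the `Sentence` record, the `score` field, and the top-k selection rule are all assumptions made for illustration, since the abstract does not say how sentences are scored or how many are kept. Token vectors are plain Python lists standing in for encoder hidden states.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class Sentence:
    """A sentence span inside one encoded passage (hypothetical structure)."""
    passage_id: int   # index of the passage the sentence belongs to
    start: int        # first token position in that passage (inclusive)
    end: int          # last token position in that passage (exclusive)
    score: float      # placeholder selection score; how it is computed is assumed


def select_sentences(sentences: List[Sentence], k: int) -> List[Sentence]:
    """Keep the k highest-scoring sentences, restored to document order."""
    top = sorted(sentences, key=lambda s: s.score, reverse=True)[:k]
    return sorted(top, key=lambda s: (s.passage_id, s.start))


def compress_context(encoded: List[List[Vector]],
                     sentences: List[Sentence],
                     k: int) -> List[Vector]:
    """Concatenate the token vectors of the selected sentences only."""
    kept: List[Vector] = []
    for s in select_sentences(sentences, k):
        kept.extend(encoded[s.passage_id][s.start:s.end])
    return kept


# Toy input: 3 passages x 6 tokens x 2 dimensions; each vector is [passage, token].
encoded = [[[float(p), float(t)] for t in range(6)] for p in range(3)]
sentences = [
    Sentence(0, 0, 3, 0.10),
    Sentence(0, 3, 6, 0.90),
    Sentence(1, 0, 6, 0.20),
    Sentence(2, 0, 2, 0.80),
    Sentence(2, 2, 6, 0.05),
]

full_length = sum(len(p) for p in encoded)
context = compress_context(encoded, sentences, k=2)
print(f"decoder context: {len(context)} of {full_length} token vectors")
```

In this toy setting the decoder would cross-attend over 5 token vectors instead of 18. A fixed `k` is only one possible selection rule; a score threshold would fit the same interface.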

Paper Structure

This paper contains 32 sections, 12 equations, 4 figures, and 10 tables.

Figures (4)

  • Figure 1: Inference Time for FiD (base) and FastFiD (base) with varying numbers of retrieved passages. As the number of retrieved passages increases, FiD encounters increasingly severe efficiency issues. Our FastFiD significantly accelerates the process by greatly reducing decoding time.
  • Figure 2: An overview of our FastFiD training pipeline. Training proceeds in two stages so that the model learns to generate answers from the selected sentences only, thereby reducing inference time.
  • Figure 3: Sentence selection performance on NQ-Dev for HybridFiD and FastFiD with 100 retrieved passages. "Retriever" denotes the accuracy of our retriever when retrieving 100 passages, which can be seen as an upper bound.
  • Figure 4: An example from the test set of NQ with 100 retrieved passages. The text highlighted in yellow represents the valuable sentences identified by our FastFiD.