Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index

Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi

TL;DR

This paper introduces the Dense-Sparse Phrase Index (DenSPI), a real-time open-domain QA approach that decouples question processing from document encoding by building an offline index of query-agnostic phrase representations. Each phrase is represented by a dense contextual embedding combined with sparse tf-idf-based lexical features, so answering a question reduces to a fast inner-product search over tens of billions of phrases (up to 60 billion for Wikipedia). Memory- and compute-efficient strategies (pointers, filtering, and quantization) shrink the index to about 1.2 TB. The authors report at least 58x faster end-to-end inference than DrQA on CPU-only setups, higher accuracy than DrQA on SQuAD-Open, and competitive accuracy on SQuAD v1.1, with ablations validating the importance of the coherency component and the hybrid search strategy. The work shows that dense-sparse phrase indexing can deliver near real-time QA at Wikipedia scale, while highlighting a decomposability gap that motivates further refinements in phrase representation and search techniques.

Abstract

Existing open-domain question answering (QA) models are not suitable for real-time usage because they need to process several long documents on-demand for every input query. In this paper, we introduce the query-agnostic indexable representation of document phrases that can drastically speed up open-domain QA and also allows us to reach long-tail targets. In particular, our dense-sparse phrase encoding effectively captures syntactic, semantic, and lexical information of the phrases and eliminates the pipeline filtering of context documents. Leveraging optimization strategies, our model can be trained in a single 4-GPU server and serve entire Wikipedia (up to 60 billion phrases) under 2TB with CPUs only. Our experiments on SQuAD-Open show that our model is more accurate than DrQA (Chen et al., 2017) with 6000x reduced computational cost, which translates into at least 58x faster end-to-end inference benchmark on CPUs.
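The core idea, scoring every pre-indexed phrase against a question with a single inner product over a dense part and a sparse part, can be illustrated with a toy sketch. This is not the authors' implementation: the three-phrase index, the two-dimensional dense vectors, the term weights, and the function names `dense_dot`, `sparse_dot`, `score`, and `search` are all hypothetical, and the exhaustive scan stands in for the much faster search a real system would need.

```python
from typing import Dict, List, Tuple

# Hypothetical toy index entry: (phrase text, dense vector, sparse term weights).
# The sparse dict stands in for tf-idf-style lexical features.
Phrase = Tuple[str, List[float], Dict[str, float]]


def dense_dot(a: List[float], b: List[float]) -> float:
    """Inner product of two dense vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Inner product of two sparse vectors stored as term -> weight dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[t] for t, w in a.items() if t in b)


def score(q_dense: List[float], q_sparse: Dict[str, float],
          p_dense: List[float], p_sparse: Dict[str, float]) -> float:
    """Inner product of the concatenated [dense; sparse] vectors,
    which equals the dense inner product plus the sparse one."""
    return dense_dot(q_dense, p_dense) + sparse_dot(q_sparse, p_sparse)


def search(index: List[Phrase], q_dense: List[float],
           q_sparse: Dict[str, float], top_k: int = 1) -> List[Tuple[float, str]]:
    """Brute-force scan over all phrases (assumption: for illustration only)."""
    scored = [(score(q_dense, q_sparse, d, s), text) for text, d, s in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up phrase index, built once and independent of any question.
INDEX: List[Phrase] = [
    ("Barack Obama", [0.9, 0.1], {"president": 1.0, "usa": 0.5}),
    ("1961", [0.1, 0.9], {"born": 1.0, "obama": 0.5}),
    ("Honolulu", [0.2, 0.7], {"born": 0.8, "hawaii": 1.0}),
]

# Made-up question encoding; only this step happens at query time.
best = search(INDEX, q_dense=[0.0, 1.0], q_sparse={"born": 1.0, "hawaii": 1.0})
print(best[0][1])
```

Because the phrase vectors do not depend on the question, they can be computed and stored ahead of time; at query time only the question is encoded and compared against the index.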

Paper Structure

This paper contains 37 sections, 9 equations, 1 figure, 5 tables.

Figures (1)

  • Figure 1: An illustrative comparison between a pipelined QA system, e.g. DrQA (Chen et al., 2017) (left), and our proposed Dense-Sparse Phrase Index (right) for open-domain QA, best viewed in color. Dark blue vectors indicate the items retrieved from the index by the query.