LSTM-based Selective Dense Text Retrieval Guided by Sparse Lexical Retrieval

Yingrui Yang, Parker Carlson, Yifan Qiao, Wentai Xie, Shanxiu He, Tao Yang

TL;DR

The paper tackles efficient fusion of dense and sparse retrieval under CPU and on-disk constraints by introducing CluSD, a two-stage, cluster-based selective dense retrieval guided by sparse results. It first narrows the search to a small set of embedding clusters using an overlap-based Stage I, then applies an LSTM in Stage II to score and select clusters for dense evaluation, with final results fused via linear interpolation. Empirical results on MS MARCO and BEIR show CluSD achieves competitive MRR@10 and NDCG@10 under a ~50 ms latency budget, often outperforming IVF-based partial dense retrieval and proximity-graph methods, especially in on-disk scenarios. The approach yields significant I/O efficiency, lower memory overhead, and strong practical potential for large-scale, CPU-friendly retrieval systems, with room for further tuning and integration with newer dense models like RepLLaMA.
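As a rough illustration of the final fusion step, the sketch below linearly interpolates sparse and dense scores per document. The function name `fuse_scores`, the weight `alpha`, and the rule that a document missing from one list falls back to that list's minimum score are all assumptions made for this example; the summary above does not specify them.

```python
def fuse_scores(sparse, dense, alpha=0.5):
    """Rank documents by a linear interpolation of sparse and dense scores.

    sparse, dense: dicts mapping doc id -> score.
    alpha: weight on the sparse score (hypothetical default).
    """
    # Assumption: a document absent from one list gets that list's lowest score.
    sparse_floor = min(sparse.values()) if sparse else 0.0
    dense_floor = min(dense.values()) if dense else 0.0

    fused = {}
    for doc_id in set(sparse) | set(dense):
        s = sparse.get(doc_id, sparse_floor)
        d = dense.get(doc_id, dense_floor)
        fused[doc_id] = alpha * s + (1.0 - alpha) * d

    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


ranking = fuse_scores({"a": 2.0, "b": 1.0}, {"b": 3.0, "c": 1.0}, alpha=0.5)
```

In practice the two score distributions usually need to be brought onto a comparable scale before interpolation; that step is omitted here.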

Abstract

This paper studies fast fusion of dense retrieval and sparse lexical retrieval, and proposes a cluster-based selective dense retrieval method called CluSD guided by sparse lexical retrieval. CluSD takes a lightweight cluster-based approach and exploits the overlap of sparse retrieval results and embedding clusters in a two-stage selection process with an LSTM model to quickly identify relevant clusters while incurring limited extra memory space overhead. CluSD triggers partial dense retrieval and performs cluster-based block disk I/O if needed. This paper evaluates CluSD and compares it with several baselines for searching in-memory and on-disk MS MARCO and BEIR datasets.
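The overlap idea behind the first selection stage can be sketched as follows: count how many top sparse results fall into each embedding cluster and keep the clusters with the largest counts as candidates. This is a simplified illustration, not the paper's exact procedure; `candidate_clusters`, `doc_to_cluster`, and `n_candidates` are hypothetical names, and the LSTM-based second stage is not modeled.

```python
from collections import Counter


def candidate_clusters(sparse_top_docs, doc_to_cluster, n_candidates=3):
    """Return the ids of the clusters that overlap most with the sparse results.

    sparse_top_docs: doc ids returned by sparse retrieval, best first.
    doc_to_cluster: dict mapping doc id -> cluster id (assumed precomputed).
    """
    # Count sparse hits per cluster, ignoring docs with no cluster assignment.
    overlap = Counter(
        doc_to_cluster[d] for d in sparse_top_docs if d in doc_to_cluster
    )
    return [cluster for cluster, _ in overlap.most_common(n_candidates)]


doc_to_cluster = {"d1": 0, "d2": 0, "d3": 0, "d4": 1, "d5": 1, "d6": 2}
selected = candidate_clusters(
    ["d1", "d2", "d3", "d4", "d5", "d6"], doc_to_cluster, n_candidates=2
)
```

Only the embeddings in the selected clusters would then need to be scored (and, for an on-disk index, read as contiguous blocks).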

Paper Structure

This paper contains 7 sections, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Illustration of CluSD and its features.
  • Figure 2: CluSD relevance and latency vs. the average number of clusters selected.