Efficiency and Effectiveness of SPLADE Models on Billion-Scale Web Document Title

Taeryun Won, Tae Kwan Lee, Hiun Kim, Hyemin Lee

TL;DR

This study benchmarks BM25, SPLADE, and Expanded-SPLADE for billion-scale web-title retrieval, focusing on the trade-off between effectiveness and efficiency when using sparse lexical representations. It demonstrates that sparse models improve retrieval quality over BM25, with Expanded-SPLADE offering the most practical balance by combining competitive effectiveness with lower latency via pruning. The authors introduce document-centric pruning, top-k query term selection, and boolean thresholding to reduce compute, showing substantial latency gains with modest loss in retrieval quality. The results inform deployment of sparse retrieval in large-scale search engines, especially for languages with non-Latin scripts like Korean.

Abstract

This paper presents a comprehensive comparison of BM25, SPLADE, and Expanded-SPLADE models in the context of large-scale web document retrieval. We evaluate the effectiveness and efficiency of these models on datasets ranging from tens of millions to billions of web document titles. SPLADE and Expanded-SPLADE, which use sparse lexical representations, demonstrate superior retrieval performance compared to BM25, especially for complex queries. However, these models incur higher computational costs. To mitigate these costs, we introduce three pruning strategies: document-centric pruning, top-k query term selection, and boolean queries with a term threshold. These strategies improve the models' efficiency without significantly sacrificing retrieval performance. The results show that Expanded-SPLADE strikes the best balance between effectiveness and efficiency, particularly when handling large datasets. Our findings offer valuable insights for deploying sparse retrieval models in large-scale search engines.
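To make the two top-k strategies concrete, here is a minimal sketch in Python. It assumes that both document-centric pruning and query term selection keep only the k highest-weighted terms of a sparse vector, and that relevance is the dot product over shared terms. The function names, the toy weights, and the dictionary representation are all hypothetical and are not taken from the paper.

```python
from typing import Dict

SparseVector = Dict[str, float]  # term -> weight (hypothetical representation)


def top_k_terms(weights: SparseVector, k: int) -> SparseVector:
    """Keep the k highest-weighted terms. k == 0 means no pruning."""
    if k == 0 or len(weights) <= k:
        return dict(weights)
    kept = sorted(weights.items(), key=lambda item: item[1], reverse=True)[:k]
    return dict(kept)


def sparse_score(query: SparseVector, doc: SparseVector) -> float:
    """Dot product over the terms the query and document share."""
    return sum(w * doc[term] for term, w in query.items() if term in doc)


# Toy weights, invented for illustration only.
doc = {"seoul": 1.8, "weather": 1.5, "forecast": 0.9, "today": 0.4, "korea": 0.2}
query = {"seoul": 1.2, "weather": 1.0, "rain": 0.3}

pruned_doc = top_k_terms(doc, k=3)      # document-centric pruning (dk = 3)
pruned_query = top_k_terms(query, k=2)  # query term selection (qk = 2)

full_score = sparse_score(query, doc)
pruned_score = sparse_score(pruned_query, pruned_doc)
```

In this toy case the pruned terms carry no matching weight, so the score is unchanged while the document stores fewer postings and the query touches fewer posting lists. On real data the pruned terms would contribute some score, which is the source of the modest quality loss the abstract mentions.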

Paper Structure

This paper contains 19 sections, 4 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Performance vs FLOPS on Small Dataset
  • Figure 2: Performance vs Latency on Large Dataset
  • Figure 3: Experimental Results of Boolean Query with Term Threshold on the Large Dataset using Expanded-SPLADE. The term threshold increases in 20% increments from right to left, starting from 0% on the rightmost node and progressing to 80% on the leftmost node.
  • Figure 4: Effect of Combining Document Pruning, Query Term Selection, and Boolean Query Thresholding on Retrieval Efficiency and Performance in the Large Dataset. For each model configuration (e.g., Splade-qk0-dk10), the term threshold increases in 20% increments from 0% (rightmost node) to 80% (leftmost node). Here, qk represents the k value for query term selection, and dk represents the k value for document-centric pruning. A value of k=0 indicates that no pruning was applied for the corresponding method.
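The term-threshold sweep described in the captions of Figures 3 and 4 can be sketched as follows. This is an illustration under a stated assumption: the term threshold is read here as the minimum percentage of query terms that a document must contain to become a scoring candidate. The listing above does not define it precisely, so this interpretation, the function name, and the toy documents are all hypothetical.

```python
from typing import Dict, List, Set


def passes_term_threshold(query_terms: Set[str], doc_terms: Set[str],
                          threshold_pct: int) -> bool:
    """Return True if the document contains at least threshold_pct percent
    of the query terms (assumed meaning of the term threshold)."""
    # Integer ceiling of threshold_pct * len(query_terms) / 100.
    required = (threshold_pct * len(query_terms) + 99) // 100
    matched = len(query_terms & doc_terms)
    return matched >= required


# Toy data, invented for illustration only.
query_terms = {"seoul", "weather", "forecast", "today", "rain"}
docs: Dict[str, Set[str]] = {
    "d1": {"seoul", "weather", "forecast", "today", "rain"},
    "d2": {"seoul", "weather", "news"},
    "d3": {"rain", "boots"},
}

# Sweep the threshold in 20% increments, as in the figure captions.
candidates_per_threshold: Dict[int, List[str]] = {}
for pct in (0, 20, 40, 60, 80):
    candidates_per_threshold[pct] = sorted(
        doc_id for doc_id, terms in docs.items()
        if passes_term_threshold(query_terms, terms, pct)
    )
```

Under this reading, a higher threshold shrinks the candidate set before any scoring happens, which would be consistent with the latency decreasing from right to left in the figures.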