Bridging Dense and Sparse Maximum Inner Product Search

Sebastian Bruch; Franco Maria Nardini; Amir Ingber; Edo Liberty

TL;DR

This paper addresses the fragmentation between dense and sparse Maximum Inner Product Search (MIPS) by proposing a unified IVF-based framework that applies dense MIPS techniques to sparse and hybrid vectors. It investigates linear and non-linear dimensionality reduction—Johnson-Lindenstrauss (JL) and Sinnamon transforms—to sketch high-dimensional sparse vectors, enabling effective clustering and dynamic pruning within inverted indexes. The authors extend IVF to sketches of sparse vectors, analyze clustering strategies (standard vs. spherical KMeans), and introduce a dynamic-pruning inverted-index organization with skip pointers, demonstrating substantial throughput gains and robust performance across query distributions. Finally, they propose a unified MIPS regime for hybrid dense-sparse vectors, showing potential improvements over two-stage dense-sparse retrieval and outlining research opportunities in sparse representation learning and multi-modal retrieval. The work offers a practical, density-agnostic pathway for scalable MIPS with broad implications for lexical-semantic search and hybrid-vector applications.

Abstract

Maximum inner product search (MIPS) over dense and sparse vectors has progressed independently in a bifurcated literature for decades; the latter is better known as top-$k$ retrieval in Information Retrieval. This duality exists because sparse and dense vectors serve different end goals. That is despite the fact that they are manifestations of the same mathematical problem. In this work, we ask if algorithms for dense vectors could be applied effectively to sparse vectors, particularly those that violate the assumptions underlying top-$k$ retrieval methods. We study IVF-based retrieval where vectors are partitioned into clusters and only a fraction of clusters are searched during retrieval. We conduct a comprehensive analysis of dimensionality reduction for sparse vectors, and examine standard and spherical KMeans for partitioning. Our experiments demonstrate that IVF serves as an efficient solution for sparse MIPS. As byproducts, we identify two research opportunities and demonstrate their potential. First, we cast the IVF paradigm as a dynamic pruning technique and turn that insight into a novel organization of the inverted index for approximate MIPS for general sparse vectors. Second, we offer a unified regime for MIPS over vectors that have dense and sparse subspaces, and show its robustness to query distributions.
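
The IVF idea described in the abstract (partition the collection into clusters, then score only the documents in a few clusters per query) can be illustrated with a minimal sketch. It assumes small dense vectors standing in for sketches of sparse vectors, a plain Lloyd-style standard KMeans, and cluster routing by the inner product between the query and each centroid. The names `kmeans`, `build_ivf`, `ivf_search`, and `num_probe` are hypothetical and are not taken from the paper.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def kmeans(points, num_clusters, iterations=10, seed=0):
    """Standard KMeans (Lloyd's algorithm) under squared Euclidean distance."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, num_clusters)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(num_clusters),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(num_clusters):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

def build_ivf(points, num_clusters):
    """Partition document ids into clusters; returns (centroids, clusters)."""
    centroids, assignment = kmeans(points, num_clusters)
    clusters = [[] for _ in range(num_clusters)]
    for doc_id, c in enumerate(assignment):
        clusters[c].append(doc_id)
    return centroids, clusters

def ivf_search(query, points, centroids, clusters, num_probe, k):
    """Score only the documents in the num_probe most promising clusters."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: dot(query, centroids[c]), reverse=True)
    candidates = [d for c in ranked[:num_probe] for d in clusters[c]]
    candidates.sort(key=lambda d: dot(query, points[d]), reverse=True)
    return candidates[:k]

# Toy collection: 200 random 16-dimensional vectors and one query.
rng = random.Random(42)
docs = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
query = [rng.gauss(0.0, 1.0) for _ in range(16)]

centroids, clusters = build_ivf(docs, num_clusters=8)
approx_top = ivf_search(query, docs, centroids, clusters, num_probe=2, k=5)
exact_top = sorted(range(len(docs)),
                   key=lambda d: dot(query, docs[d]), reverse=True)[:5]
```

Probing every cluster (`num_probe=8` here) reduces to exhaustive search, while a smaller `num_probe` trades accuracy for fewer documents examined.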

Paper Structure (46 sections, 5 theorems, 16 equations, 11 figures, 2 tables, 6 algorithms)

Introduction
Maximum Inner Product Search as the Unifying Problem
Sparse MIPS as a Subclass of Dense MIPS
Research Byproducts
Contributions
Structure
Related Work
Sparse MIPS
Sparse MIPS for Text Collections
Signatures for Logical Queries
General Sparse MIPS
Dense MIPS
Notation and Experimental Setup
Notation
Experimental Configuration
...and 31 more sections

Key Result

lemma 1

For $0 < \epsilon < 1$ and any set $\mathcal{V}$ of $|\mathcal{V}|$ points in $\mathbb{R}^N$, and an integer $n = \Omega(\epsilon^{-2} \ln |\mathcal{V}|)$, there exists a Lipschitz mapping $f: \mathbb{R}^N \rightarrow \mathbb{R}^n$ such that $(1 - \epsilon) \lVert u - v \rVert_2^2 \leq \lVert f(u) - f(v) \rVert_2^2 \leq (1 + \epsilon) \lVert u - v \rVert_2^2$ for all $u, v \in \mathcal{V}$.
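
The Johnson-Lindenstrauss lemma guarantees that pairwise squared distances are approximately preserved by a suitable low-dimensional map. One standard construction, assumed here for illustration and not necessarily the one used in the paper, is a random matrix with i.i.d. Gaussian entries scaled by $1/\sqrt{n}$. The sketch below projects a few points from $N = 500$ to $n = 200$ dimensions and records the ratio of projected to original squared distances. The names `jl_matrix`, `project`, and `sq_dist` are hypothetical.

```python
import math
import random

def jl_matrix(n, N, seed=0):
    """An n x N matrix of i.i.d. N(0, 1) entries scaled by 1 / sqrt(n)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(N)] for _ in range(n)]

def project(matrix, v):
    """Apply the linear map f(v) = matrix @ v."""
    return [sum(r * x for r, x in zip(row, v)) for row in matrix]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

rng = random.Random(1)
N, n = 500, 200
points = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(10)]

R = jl_matrix(n, N)
sketches = [project(R, p) for p in points]

# Ratio of sketched to original squared distance for every pair of points.
# The lemma says these ratios concentrate around 1 as n grows.
ratios = [
    sq_dist(sketches[i], sketches[j]) / sq_dist(points[i], points[j])
    for i in range(len(points))
    for j in range(i + 1, len(points))
]
print(f"min ratio = {min(ratios):.3f}, max ratio = {max(ratios):.3f}")
```

Because the map is linear, it is Lipschitz, and the spread of `ratios` around 1 plays the role of $\epsilon$ in the lemma.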

Figures (11)

Figure 1: Top-$1$ accuracy of retrieval for test queries over sketches produced by JL transform (left column), Weak Sinnamon (middle column), and, as a point of reference, the original Sinnamon algorithm (right column). We retrieve the top-$k^\prime$ documents by performing an exhaustive search over the sketch collection and re-ranking the candidates by exact inner product to obtain the top-$1$ document and compute accuracy. Each line in the figures represents a different sketch size $n$. We note that Weak Sinnamon and Sinnamon only use half the sketch to record upper-bounds but leave the lower-bound sketch unused because Splade vectors are non-negative. That implies that their effective sketch size is half that of the JL transform.
Figure 2: Top-$10$ accuracy of retrieval for test queries over sketches of size $n=1024$ produced by JL transform (left column), Weak Sinnamon (middle column), and, for reference, the original Sinnamon algorithm (right column). As in Figure 1, we retrieve the top-$k^\prime$ documents by performing an exhaustive search over the sketch collection and re-ranking the candidates by exact inner product to obtain the top-$10$ documents and compute accuracy. Similarly, each line in the figures represents a different sketch size $n$. In these experiments, however, we adjust the effective sketch size of Weak Sinnamon and Sinnamon to match that of the JL transform.
Figure 3: Probability of each coordinate being non-zero ($p_i$ for coordinate $i$) for Splade and Efficient Splade vectors of several datasets. To aid visualization, we sort the coordinates by $p_i$ in descending order. A Zipfian distribution would manifest as a line in the log-log plot. Notice that this distribution is closer to uniform for MS Marco than for the other datasets.
Figure 4: Top-$10$ accuracy of the retrieval algorithm for Splade vectors versus the number of documents examined ($\ell$), expressed as a percentage of the size of the collection, for different clustering algorithms (standard and spherical KMeans) and different sketching mechanisms (JL transform and Weak Sinnamon, with a sketch size of $1024$). Note that the vertical axis is not consistent across figures.
Figure 5: Top-$10$ accuracy of the retrieval algorithm for Efficient Splade vs. the number of documents examined ($\ell$).
...and 6 more figures
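
The evaluation procedure in the captions of Figures 1 and 2 (retrieve the top-$k^\prime$ candidates by exhaustive search over sketches, then re-rank them by exact inner product) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it covers only a Gaussian JL sketch (Weak Sinnamon and Sinnamon are not implemented), uses random non-negative sparse vectors in place of Splade vectors, and all function names are hypothetical.

```python
import math
import random

def sparse_dot(u, v):
    """Exact inner product of two sparse vectors stored as {coordinate: value}."""
    if len(u) > len(v):
        u, v = v, u
    return sum(value * v.get(coord, 0.0) for coord, value in u.items())

def jl_sketch(vec, n, seed=0):
    """Dense n-dimensional sketch of a sparse vector via a Gaussian projection.

    The projection-matrix column for each coordinate is regenerated from a
    seeded PRNG, so the full N x n matrix is never materialized.
    """
    out = [0.0] * n
    for coord, value in vec.items():
        column_rng = random.Random(seed * 1_000_003 + coord)
        for j in range(n):
            out[j] += value * column_rng.gauss(0.0, 1.0)
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in out]

def retrieve_then_rerank(query, docs, doc_sketches, n, k_prime, k):
    """Top-k' by sketch inner product, then top-k by exact inner product."""
    q_sketch = jl_sketch(query, n)

    def approx_score(d):
        return sum(a * b for a, b in zip(q_sketch, doc_sketches[d]))

    candidates = sorted(range(len(docs)), key=approx_score, reverse=True)[:k_prime]
    candidates.sort(key=lambda d: sparse_dot(query, docs[d]), reverse=True)
    return candidates[:k]

# Toy data: non-negative sparse vectors with 20 non-zeros in 1000 dimensions.
rng = random.Random(7)
N, n = 1000, 64

def random_sparse(num_nonzero):
    return {c: rng.random() for c in rng.sample(range(N), num_nonzero)}

docs = [random_sparse(20) for _ in range(50)]
query = random_sparse(20)
doc_sketches = [jl_sketch(d, n) for d in docs]

top1 = retrieve_then_rerank(query, docs, doc_sketches, n, k_prime=10, k=1)
```

Setting `k_prime` to the collection size makes the re-ranking step exhaustive, so the result is then exact; smaller values of `k_prime` expose the error introduced by the sketch.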

Theorems & Definitions (5)

lemma 1: Johnson-Lindenstrauss
theorem 1
theorem 2
theorem 3
theorem 4
