Generative Retrieval as Multi-Vector Dense Retrieval

Shiguang Wu; Wenda Wei; Mengqi Zhang; Zhumin Chen; Jun Ma; Zhaochun Ren; Maarten de Rijke; Pengjie Ren

TL;DR

The paper investigates how generative retrieval (GR) relates to multi-vector dense retrieval (MVDR). It shows that GR’s decoder-based relevance score can be written in the MVDR form: a sum over query-document token interactions weighted by an alignment matrix derived from attention. GR and MVDR therefore share the same core computation, in which token-level vectors are combined through an alignment mechanism, but they differ in how documents are encoded, how sparse the alignment is, and in which direction it runs. Through theoretical derivations and experiments with T5-based variants on NQ320K and MS MARCO, the authors demonstrate a low-rank structure in the alignment matrices and comparable term-matching behavior in both paradigms. They also highlight practical differences between end-to-end and reranking settings and the impact of improved document encoding (PAWA, NP decoding). The findings give a principled basis for integrating GR into MVDR-inspired frameworks and for improving generative retrieval through its alignment strategy and document representation.

Abstract

Generative retrieval generates identifiers of relevant documents in an end-to-end manner using a sequence-to-sequence architecture for a given query. The relation between generative retrieval and other retrieval methods, especially those based on matching within dense retrieval models, is not yet fully understood. Prior work has demonstrated that generative retrieval with atomic identifiers is equivalent to single-vector dense retrieval. Accordingly, generative retrieval exhibits behavior analogous to hierarchical search within a tree index in dense retrieval when using hierarchical semantic identifiers. However, prior work focuses solely on the retrieval stage without considering the deep interactions within the decoder of generative retrieval. In this paper, we fill this gap by demonstrating that generative retrieval and multi-vector dense retrieval share the same framework for measuring the relevance of a document to a query. Specifically, we examine the attention layer and prediction head of generative retrieval, revealing that generative retrieval can be understood as a special case of multi-vector dense retrieval. Both methods compute relevance as a sum of products of query and document vectors and an alignment matrix. We then explore how generative retrieval applies this framework, employing distinct strategies for computing document token vectors and the alignment matrix. We conduct experiments to verify our conclusions and show that both paradigms exhibit commonalities of term matching in their alignment matrices.
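The shared framework named in the abstract, relevance as a sum of products of query and document vectors and an alignment matrix, can be illustrated with a small sketch. It assumes a ColBERT-style MaxSim rule as the MVDR alignment (each query token selects its best-matching document token, giving a sparse 0/1 matrix) and omits any normalisation. The function names and toy vectors are hypothetical and are not taken from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def similarity(doc_vecs, query_vecs):
    """S[i][j] = d_i . q_j, i.e. the token-level matrix D^T Q."""
    return [[dot(d, q) for q in query_vecs] for d in doc_vecs]

def maxsim_alignment(sim):
    """Sparse 0/1 alignment: query token j selects its best document token.

    Assumption: MaxSim-style, query-to-document alignment.
    """
    n_doc, n_query = len(sim), len(sim[0])
    align = [[0.0] * n_query for _ in range(n_doc)]
    for j in range(n_query):
        best = max(range(n_doc), key=lambda i: sim[i][j])
        align[best][j] = 1.0
    return align

def relevance(sim, align):
    """Elementwise product of D^T Q and A, summed over all entries."""
    return sum(s * a
               for s_row, a_row in zip(sim, align)
               for s, a in zip(s_row, a_row))

# Toy data: three document token vectors, two query token vectors.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.5, 1.0]]
query_vecs = [[1.0, 0.0], [0.5, 0.5]]

sim = similarity(doc_vecs, query_vecs)
align = maxsim_alignment(sim)
score = relevance(sim, align)
print(score)  # 1.75
```

With this choice of alignment the score equals the sum over query tokens of the maximum similarity to any document token, so a different alignment matrix changes the relevance function without changing the surrounding formula.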

Paper Structure (32 sections, 2 theorems, 24 equations, 5 figures, 8 tables)

Introduction
Related Work
Preliminaries
In-Depth Analysis of Generative Retrieval
Model architecture and training loss
GR has the same framework as MVDR
Comparison between MVDR and GR
Document encoding
Alignment strategy
The concept of "alignment" in both methods
Different sparsity and learnability: sparse vs. dense and learned alignment matrices
Different alignment directions: query-to-document vs. document-to-query alignment
Low-rank nature of both alignment matrices
Decomposition of both relevance scores
Upshot
...and 17 more sections

Key Result

lemma 1

For a matrix $\mathbold{A} = \operatorname{softmax}\left(\mathbold{D}^\top \mathbold{W}\mathbold{Q}\right)$, there exists a rank-one matrix $\mathbold{R}$ such that where the term $\gamma$ depends on the matrix entries.
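To make the objects in Lemma 1 concrete, the sketch below builds an attention-style alignment matrix $\operatorname{softmax}(\mathbold{D}^\top \mathbold{W}\mathbold{Q})$ and measures how far it is from one simple rank-one matrix. Two assumptions are made for illustration only. First, the softmax is taken over query tokens for each document token. Second, the rank-one matrix repeats the mean row of $\mathbold{A}$, which is a stand-in and not the construction used in the lemma. All names and toy values are hypothetical.

```python
import math

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_alignment(doc_vecs, weight, query_vecs):
    """A = softmax(D^T W Q); each row (one document token) sums to 1.

    Assumption: normalisation runs over query tokens.
    """
    def bilinear(d, q):
        wq = [sum(w * x for w, x in zip(w_row, q)) for w_row in weight]
        return sum(a * b for a, b in zip(d, wq))
    return [softmax([bilinear(d, q) for q in query_vecs]) for d in doc_vecs]

def mean_row_rank_one(align):
    """Illustrative rank-one matrix: every row is the average row of A."""
    n_doc, n_query = len(align), len(align[0])
    mean_row = [sum(align[i][j] for i in range(n_doc)) / n_doc
                for j in range(n_query)]
    return [list(mean_row) for _ in range(n_doc)]

def frobenius_gap(a, b):
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b)
                         for x, y in zip(ra, rb)))

doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

A = attention_alignment(doc_vecs, identity, query_vecs)
R = mean_row_rank_one(A)
gap = frobenius_gap(A, R)
print(gap)
```

The gap is zero exactly when all rows of the alignment matrix coincide, for example when the bilinear weight matrix is all zeros and attention is uniform. It grows as different document tokens attend to different query tokens.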

Figures (5)

Figure 1: Summary of our derivation and conclusion. The logits of GR can be reformulated as $\operatorname{sum}(\mathbold{E}_d^\top\mathbold{Q}\odot\mathbold{A})$, which corresponds to the framework $\operatorname{sum}(\mathbold{D}^\top \mathbold{Q} \odot \mathbold{A})$ of MVDR.
Figure 2: Exact match rate of MVDR on the NQ320k dataset in the query-to-document direction.
Figure 3: Soft exact match rate of MVDR and GR on the NQ320k dataset in the document-to-query direction.
Figure 4: Low-rank approximation of $\mathbold{R}$ to the alignment matrix $\mathbold{A}$ in GR on the MS MARCO dataset.
Figure 5: An example of alignment matrix in MVDR and GR.
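The reformulation summarised in the Figure 1 caption can be checked numerically on toy data. The sketch below is a simplified single-head view: it assumes the alignment matrix is already given as attention weights from document identifier tokens to query tokens, and it ignores value and output projections, multiple heads, and layer normalisation. It compares pooling the query vectors with attention and then scoring against the identifier-token embeddings with the direct evaluation of $\operatorname{sum}(\mathbold{E}_d^\top\mathbold{Q}\odot\mathbold{A})$. All names and numbers are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def framework_score(id_embs, query_vecs, align):
    """sum(E_d^T Q (elementwise) A) written as an explicit double sum."""
    return sum(align[i][j] * dot(e, q)
               for i, e in enumerate(id_embs)
               for j, q in enumerate(query_vecs))

def attention_score(id_embs, query_vecs, align):
    """Pool query vectors with row i of A, then score against e_i."""
    total = 0.0
    for i, e in enumerate(id_embs):
        pooled = [sum(align[i][j] * q[k] for j, q in enumerate(query_vecs))
                  for k in range(len(e))]
        total += dot(e, pooled)
    return total

# Two identifier-token embeddings, three query token vectors.
id_embs = [[1.0, 0.0], [0.0, 2.0]]
query_vecs = [[1.0, 1.0], [2.0, 0.0], [0.0, 4.0]]
# Dense alignment: each row is a distribution over query tokens.
align = [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]

s_framework = framework_score(id_embs, query_vecs, align)
s_attention = attention_score(id_embs, query_vecs, align)
print(s_framework, s_attention)  # 5.5 5.5
```

The two values agree because the dot product is linear in its second argument, which is what allows attention-weighted pooling to be rewritten as a weighted sum of token-level similarities.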

Theorems & Definitions (2)

lemma 1
lemma 2
