Approximate Reverse $k$-Ranks Queries in High Dimensions

Daichi Amagata, Kazuyoshi Aoyama, Keito Kido, Sumio Fujita

TL;DR

The paper tackles reverse $k$-ranks queries in high-dimensional inner-product spaces by introducing a $c$-approximate variant and a novel rank-table-based algorithm. The algorithm builds per-user lower and upper bounds on ranks from a rank table and thresholded inner products. These bounds enable pruning and interpolation that reduce online processing to $O(nd)$ time, avoiding the $O(nmd)$ cost of prior approaches. The offline preprocessing uses norm-based bucketing and random sampling to estimate rank-table entries efficiently, and it is faster to set up than the leading method, QSRP. Empirical results on real-world datasets show substantial speedups, strong accuracy, and robustness to $k$ and $c$, highlighting practical value for item-centric recommendation and targeted search in high dimensions.

Abstract

Many objects are represented as high-dimensional vectors nowadays. In this setting, the relevance between two objects (vectors) is usually evaluated by their inner product. Recently, item-centric searches, which search for users relevant to query items, have received attention and find important applications, such as product promotion and market analysis. To support these applications, this paper considers reverse $k$-ranks queries. Given a query vector $\mathbf{q}$, $k$, a set $\mathbf{U}$ of user vectors, and a set $\mathbf{P}$ of item vectors, this query retrieves the $k$ user vectors $\mathbf{u} \in \mathbf{U}$ with the highest $r(\mathbf{q},\mathbf{u},\mathbf{P})$, where $r(\mathbf{q},\mathbf{u},\mathbf{P})$ shows the rank of $\mathbf{q}$ for $\mathbf{u}$ among $\mathbf{P}$. Because efficiently computing the exact answer for this query is difficult in high dimensions, we address the problem of approximate reverse $k$-ranks queries. Informally, given an approximation factor $c$, this problem allows, as an output, a user $\mathbf{u}'$ such that $r(\mathbf{q},\mathbf{u}',\mathbf{P}) > \tau$ but $r(\mathbf{q},\mathbf{u}',\mathbf{P}) \leq c \times \tau$, where $\tau$ is the rank threshold for the exact answer. We propose a new algorithm for solving this problem efficiently. Through theoretical and empirical analyses, we confirm the efficiency and effectiveness of our algorithm.
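To make the query concrete, the following is a minimal brute-force sketch in Python, not the paper's algorithm. It assumes that the rank of $\mathbf{q}$ for $\mathbf{u}$ is one plus the number of items in $\mathbf{P}$ whose inner product with $\mathbf{u}$ exceeds that of $\mathbf{q}$, and that "highest rank" means the smallest rank value; the summary above does not spell out either convention. The function names are hypothetical.

```python
import numpy as np

def ranks_of_query(q, U, P):
    """Rank of q for every user (assumed definition):
    1 + number of items p in P with u.p > u.q."""
    user_item = U @ P.T          # (n, m) inner products of users and items
    user_query = U @ q           # (n,) inner products of users and the query
    return 1 + (user_item > user_query[:, None]).sum(axis=1)

def reverse_k_ranks(q, U, P, k):
    """Exact answer by exhaustive scan: indices of the k users
    for whom q has the smallest rank value, plus all ranks."""
    ranks = ranks_of_query(q, U, P)
    return np.argsort(ranks, kind="stable")[:k], ranks

def is_c_approximate(result, ranks, k, c):
    """Check the informal condition: every returned user has
    rank <= c * tau, where tau is the k-th smallest exact rank."""
    tau = np.sort(ranks)[k - 1]
    return bool(np.all(ranks[np.asarray(result)] <= c * tau))

# Tiny illustrative data (made up for this sketch).
U = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
P = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
q = np.array([1.0, 0.5])
top, ranks = reverse_k_ranks(q, U, P, k=1)
```

The exhaustive scan computes all user-item inner products, which is the kind of cost that the approximate formulation is meant to avoid.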

Paper Structure

This paper contains 12 sections, 1 theorem, 7 equations, 4 figures, 3 tables, 1 algorithm.

Key Result

Lemma 1

(1) If $r^{\uparrow}_{\mathbf{u}} \leq c \times R^{\downarrow}_{k}$, $\mathbf{u}$ can be in $\mathbf{U}_{c}$. (2) If $r^{\downarrow}_{\mathbf{u}} > R^{\uparrow}_{k}$, $\mathbf{u} \notin \mathbf{U}_{c}$.
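
The two conditions of Lemma 1 can be read as a filter over per-user rank bounds. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes $r^{\downarrow}_{\mathbf{u}}$ and $r^{\uparrow}_{\mathbf{u}}$ are lower and upper bounds on the rank for user $\mathbf{u}$, and that $R^{\downarrow}_{k}$ and $R^{\uparrow}_{k}$ are the $k$-th smallest lower and upper bounds over all users. These definitions are not given in this summary, and the function name is hypothetical.

```python
import numpy as np

def classify_users(r_low, r_up, k, c):
    """Apply the two conditions of Lemma 1 to per-user rank bounds.

    Assumptions (not stated in the summary): R_low_k and R_up_k are the
    k-th smallest lower and upper bounds across all users.
    Returns three boolean masks: accept, prune, undecided.
    """
    r_low = np.asarray(r_low)
    r_up = np.asarray(r_up)
    R_low_k = np.partition(r_low, k - 1)[k - 1]   # k-th smallest lower bound
    R_up_k = np.partition(r_up, k - 1)[k - 1]     # k-th smallest upper bound

    accept = r_up <= c * R_low_k    # condition (1): u can be in U_c
    prune = r_low > R_up_k          # condition (2): u is not in U_c
    # The two conditions are evaluated independently; users matching
    # neither need their rank bounds refined further.
    undecided = ~(accept | prune)
    return accept, prune, undecided

# Made-up bounds for four users.
accept, prune, undecided = classify_users(
    r_low=[1, 3, 5, 20], r_up=[2, 6, 9, 30], k=2, c=2
)
```

In this example only the third user remains undecided, so only that user would need further rank computation.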

Figures (4)

  • Figure 1: Example of a rank table
  • Figure 2: Norm distributions
  • Figure 3: Impact of $k$: "$\times$" shows QSRP and "$\circ$" shows Ours.
  • Figure 4: Impact of $c$: "$\times$" shows QSRP and "$\circ$" shows Ours.

Theorems & Definitions (4)

  • Definition 1: Rank of $\mathbf{q}$ for $\mathbf{u}$
  • Definition 2: Reverse $k$-ranks query
  • Definition 3: $c$-approximate reverse $k$-ranks query
  • Lemma 1