Average Precision at Cutoff k under Random Rankings: Expectation and Variance

Tetiana Manzhos, Tetiana Ianevych, Olga Melnyk

TL;DR

The paper tackles the problem of establishing principled random baselines for Mean Average Precision at cutoff k, MAP@k, by deriving exact expressions for the expectation and variance of AP@k under two evaluation schemes: offline (sampling without replacement) and online (sampling with replacement). It provides closed-form formulas that incorporate harmonic numbers and scenario parameters (N,m,k for offline; p and k for online), enabling principled benchmarking of ranking algorithms. The results quantify both the expected random performance and the scale of random fluctuations, aiding interpretation of observed MAP@k scores and informing benchmarking practices. The work also includes practical illustrations via scenario analyses and simulations that compare the offline and online baselines and highlight conditions under which their behavior converges or diverges, with guidance for future methodological extensions.

Abstract

Recommender systems and information retrieval platforms rely on ranking algorithms to present the most relevant items to users, thereby improving engagement and satisfaction. Assessing the quality of these rankings requires reliable evaluation metrics. Among them, Mean Average Precision at cutoff k (MAP@k) is widely used, as it accounts for both the relevance of items and their positions in the list. In this paper, the expectation and variance of Average Precision at k (AP@k) are derived, since they can be used as baselines for MAP@k. We cover two widely used evaluation models: offline and online. The expectation establishes the baseline, indicating the level of MAP@k that can be achieved by pure chance. The variance complements this baseline by quantifying the extent of random fluctuations, enabling a more reliable interpretation of observed scores.

Paper Structure

This paper contains 6 sections, 5 theorems, 46 equations, 2 figures, 2 tables.

Key Result

Theorem 4.2

If among $N$ available items, exactly $m$ are relevant for every user, then the value $MAP_{WOR}@k$ (i.e., the expectation of $AP@k$ with respect to all possible random results under sampling without replacement) is equal to where $H_k=\sum_{i=1}^k\frac{1}{i}$ is the $k$-th "harmonic" number that can be calculated using the approximation $H_k\approx \ln k+\gamma+\frac{1}{2k}$, $\gamma=0.5772$ is
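The harmonic-number approximation quoted in the theorem can be checked numerically. The sketch below is illustrative only: the function names `harmonic_exact` and `harmonic_approx` are hypothetical and not taken from the paper, and the Euler-Mascheroni constant is written out to more digits than the rounded 0.5772 given above.

```python
import math
from fractions import Fraction

# Euler-Mascheroni constant; the text above rounds it to 0.5772.
EULER_GAMMA = 0.5772156649015329


def harmonic_exact(k: int) -> float:
    """H_k = sum_{i=1}^{k} 1/i, summed with exact rationals, then converted."""
    return float(sum(Fraction(1, i) for i in range(1, k + 1)))


def harmonic_approx(k: int) -> float:
    """Approximation H_k ~ ln k + gamma + 1/(2k) quoted in the theorem."""
    return math.log(k) + EULER_GAMMA + 1.0 / (2 * k)


if __name__ == "__main__":
    # Compare the exact value and the approximation for a few cutoffs.
    for k in (1, 5, 20, 40):
        exact = harmonic_exact(k)
        approx = harmonic_approx(k)
        print(f"k={k:3d}  exact={exact:.6f}  approx={approx:.6f}  "
              f"abs_err={abs(exact - approx):.2e}")
```

The approximation is loosest for very small $k$ and tightens quickly as $k$ grows, so for small cutoffs the exact sum is cheap enough to use directly.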

Figures (2)

  • Figure 1: Histogram of AP@k values for Scenario A3 $(N = 50, m = 25, p = 0.5, k = 40)$. The WOR distribution (blue) is shifted to higher values compared to WR (orange), reflecting normalization by $\min(m, k)$ in the offline case
  • Figure 2: Histogram of AP@k values for Scenario C $(N = 50, m = 2, p = 0.04, k = 20)$. The WR distribution (orange) is tightly concentrated around zero, while the WOR distribution (blue) shows wider variability due to the finite number of relevant items
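A histogram like the offline (WOR) one in the figures can be approximated by Monte Carlo simulation. The following is a minimal sketch under stated assumptions, not the authors' code: it assumes the common binary-relevance definition of AP@k (the sum of precision-at-$i$ over the relevant positions $i \le k$), normalized by $\min(m, k)$ as the Figure 1 caption indicates for the offline case. The names `ap_at_k` and `simulate_wor`, the trial count, and the seed are hypothetical. The online (WR) scheme is not simulated here because its normalization is not stated in this summary.

```python
import random


def ap_at_k(relevance, m, k):
    """AP@k for a binary relevance list, normalized by min(m, k).

    Assumed definition: sum of precision@i over relevant positions i <= k.
    """
    hits = 0
    total = 0.0
    for i, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            total += hits / i  # precision at position i
    return total / min(m, k)


def simulate_wor(N, m, k, trials=20000, seed=0):
    """Estimate mean and sample variance of AP@k under random rankings.

    Each trial draws the top-k of a uniformly random ordering of N items,
    exactly m of which are relevant (sampling without replacement).
    """
    rng = random.Random(seed)
    items = [1] * m + [0] * (N - m)
    values = []
    for _ in range(trials):
        top_k = rng.sample(items, k)  # first k positions of a random permutation
        values.append(ap_at_k(top_k, m, k))
    mean = sum(values) / trials
    var = sum((v - mean) ** 2 for v in values) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    # Offline parameters of Scenario A3 and Scenario C from the captions.
    for name, (N, m, k) in {"A3": (50, 25, 40), "C": (50, 2, 20)}.items():
        mean, var = simulate_wor(N, m, k)
        print(f"Scenario {name}: mean AP@k ~ {mean:.4f}, variance ~ {var:.5f}")
```

The simulated mean and variance can then be compared against the closed-form values from the paper's theorems; any gap beyond Monte Carlo noise would suggest that the AP@k definition assumed here differs from the one used in the paper.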

Theorems & Definitions (12)

  • Remark 4.1
  • Theorem 4.2
  • proof
  • Remark 4.3
  • Theorem 4.4
  • Lemma 4.5
  • proof
  • proof: Proof of Theorem 4.4
  • Theorem 5.1
  • proof
  • ...and 2 more