VQPP: Video Query Performance Prediction Benchmark

Adrian Catalin Lutu; Eduard Poesina; Radu Tudor Ionescu

TL;DR

This work proposes the first benchmark for video query performance prediction (VQPP), comprising two text-to-video retrieval datasets and two content-based video retrieval (CBVR) systems. It demonstrates the applicability of VQPP by employing the best-performing pre-retrieval predictor as a reward model for training a large language model (LLM) on the query reformulation task via direct preference optimization (DPO).

Abstract

Query performance prediction (QPP) is an important and actively studied information retrieval task, with applications such as query reformulation, query expansion, and retrieval system selection, among many others. The task has been primarily studied in the context of text and image retrieval, whereas QPP for content-based video retrieval (CBVR) remains largely underexplored. To this end, we propose the first benchmark for video query performance prediction (VQPP), comprising two text-to-video retrieval datasets and two CBVR systems. VQPP contains a total of 56K text queries and 51K videos, and comes with official training, validation, and test splits, fostering direct comparisons and reproducible results. We explore multiple pre-retrieval and post-retrieval performance predictors, creating a representative benchmark for future exploration of QPP in the video domain. Our results show that pre-retrieval predictors obtain competitive performance, enabling applications before performing the retrieval step. We also demonstrate the applicability of VQPP by employing the best-performing pre-retrieval predictor as a reward model for training a large language model (LLM) on the query reformulation task via direct preference optimization (DPO). We release our benchmark and code at https://github.com/AdrianLutu/VQPP.

Paper Structure (17 sections, 2 figures, 5 tables)

This paper contains 17 sections, 2 figures, 5 tables.

Introduction
Related Work
VQPP Benchmark
Overview
Datasets
Retrieval Systems
Organization
Evaluation Measures
Predictors
Pre-Retrieval Predictors
Post-Retrieval Predictors
Hyperparameter Configuration
Experiments and Results
Main Results
Ablation Studies
...and 2 more sections

Figures (2)

Figure 1: The query reformulation pipeline comprises three steps. ① A language model (Phi-4-mini-instruct [Abouelenin-Arxiv-2025]) is prompted to provide multiple query reformulations. ② The resulting queries are given as input to the fine-tuned BERT pre-retrieval predictor, which scores each query in terms of retrieval performance. ③ Reformulated queries are arranged into (winning, losing) pairs according to the rewards given by the fine-tuned BERT predictor. The language model is optimized on the resulting pairs via Online Direct Preference Optimization (DPO) [Rafailov-NeurIPS-2023; Qi-Arxiv-2024]. Best viewed in color.
Figure 2: Examples of original and reformulated queries, along with the top two videos retrieved from the MSR-VTT [xu2016msrvtt] dataset by the GRAM model [cicchetti2025gramian]. Reformulated queries obtain higher retrieval ranks (depicted in bold). Best viewed in color.
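
Step ③ of the Figure 1 pipeline, arranging scored reformulations into (winning, losing) pairs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the all-pairs pairing strategy, the `min_margin` parameter, and the prompt/chosen/rejected record layout are assumptions made for illustration, and `toy_score` is a hypothetical stand-in for the fine-tuned BERT pre-retrieval predictor.

```python
from itertools import combinations
from typing import Callable, Dict, List


def build_preference_pairs(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[str], float],
    min_margin: float = 0.0,
) -> List[Dict[str, str]]:
    """Turn scored query reformulations into (winning, losing) pairs.

    Assumption: every pair of candidates whose predicted scores differ by
    more than `min_margin` yields one preference record. The source only
    states that pairs are formed according to the predictor's rewards.
    """
    scored = [(query, score_fn(query)) for query in candidates]
    pairs = []
    for (query_a, score_a), (query_b, score_b) in combinations(scored, 2):
        if abs(score_a - score_b) <= min_margin:
            continue  # skip ties and near-ties: no clear preference
        if score_a > score_b:
            winner, loser = query_a, query_b
        else:
            winner, loser = query_b, query_a
        pairs.append({"prompt": prompt, "chosen": winner, "rejected": loser})
    return pairs


def toy_score(query: str) -> float:
    """Hypothetical placeholder reward: the number of words in the query.

    A real setup would call the fine-tuned pre-retrieval predictor here.
    """
    return float(len(query.split()))


if __name__ == "__main__":
    original = "cat"
    reformulations = ["a cat", "a cat playing piano", "cat"]
    for pair in build_preference_pairs(original, reformulations, toy_score):
        print(pair["chosen"], ">", pair["rejected"])
```

Because the predictor is passed in as `score_fn`, the placeholder can be swapped for any pre-retrieval predictor without changing the pairing logic. Records with prompt, chosen, and rejected fields are a common input layout for DPO-style training, though the exact format depends on the training library used.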
