Predicting IR Personalization Performance using Pre-retrieval Query Predictors

Eduardo Vicente-López; Luis M. de Campos; Juan M. Fernández-Luna; Juan F. Huete

TL;DR

The paper tackles the problem of predicting when IR personalization will improve or degrade performance by leveraging a broad suite of pre-retrieval predictors, including user-profile information. It systematically extends predictors to incorporate profiles (yielding 37 predictors) and analyzes their correlations with the personalization delta $diffPerso$, finding no single robust predictor. To boost predictive power, the authors employ per-profile Random Forest classification and regression, achieving about one-third of the ideal improvement by safely disabling personalization for harmful queries, with ASPIRE-based results (≈39% ideal gain) outperforming the user study. A feature-reduction experiment shows that using the top 10 predictors offers nearly the same gains as using all 37 when latency is critical. Overall, this work provides a promising framework for pre-retrieval personalization decisions and highlights directions for richer profile-aware predictors and future enhancements.

Abstract

Personalization generally improves query performance, but in a few cases it may also harm it. If we can predict those cases and disable personalization for them, overall performance will be higher and users will be more satisfied with personalized systems. For this purpose, we use several state-of-the-art pre-retrieval query performance predictors and propose others that incorporate user profile information. We study the correlations between these predictors and the performance difference between the personalized and the original queries. We also apply classification and regression techniques to improve the results, ultimately reaching slightly more than one third of the maximum ideal performance. We consider this a good starting point for this line of research, which certainly needs further effort and improvement.
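To make the "fraction of the maximum ideal performance" idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the ideal case is an oracle that disables personalization exactly for the queries it harms, and that the achieved gain is measured relative to always personalizing. The function names, the per-query scores, and the predicted decisions are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def diff_perso(original, personalized):
    """Per-query delta; a positive value means personalization helped."""
    return [p - o for o, p in zip(original, personalized)]


def selective_scores(original, personalized, keep_personalized):
    """Score obtained per query when personalization is kept only where flagged."""
    return [
        p if keep else o
        for o, p, keep in zip(original, personalized, keep_personalized)
    ]


def fraction_of_ideal_gain(original, personalized, keep_personalized):
    """Share of the oracle's improvement over 'always personalize' that is achieved.

    Assumption: the oracle keeps personalization only where the delta is >= 0.
    """
    baseline = mean(personalized)  # always personalize
    oracle_keep = [d >= 0 for d in diff_perso(original, personalized)]
    ideal = mean(selective_scores(original, personalized, oracle_keep))
    achieved = mean(selective_scores(original, personalized, keep_personalized))
    if ideal == baseline:
        return 0.0  # personalization never hurts, so there is nothing to gain
    return (achieved - baseline) / (ideal - baseline)


# Hypothetical per-query effectiveness scores (e.g. a rank-based metric).
original = [0.50, 0.40, 0.60, 0.30]
personalized = [0.70, 0.30, 0.40, 0.50]

# Hypothetical predictor output: it catches one of the two harmful queries.
predicted_keep = [True, False, True, True]

fraction = fraction_of_ideal_gain(original, personalized, predicted_keep)
print(f"fraction of ideal gain: {fraction:.3f}")
```

In this toy data, personalization harms the second and third queries. The hypothetical predictor disables it only for the second one, so it recovers part of the gain that a perfect oracle would obtain.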

Paper Structure (10 sections, 13 equations, 9 tables)

Introduction
Related Work
Pre-retrieval Predictors for Personalization Performance
Including the profile in predictors
Experimental Environment and Results
Experimental framework
Correlations between predictors and diffPerso
Using classification and regression techniques
Conclusions and Future Work
Appendix
