The Benefits of Diversity: Combining Comparisons and Ratings for Efficient Scoring
Julien Fageot, Matthias Grossglauser, Lê-Nguyên Hoang, Matteo Tacchi-Bénard, Oscar Villemaud
TL;DR
This work addresses how to elicit human preferences most efficiently by unifying direct ratings and pairwise comparisons in a single probabilistic framework. The proposed SCoRa model jointly reasons over embeddings, comparisons, and ratings using a generalized Bradley–Terry formulation with a learnable threshold, and its MAP estimator comes with monotonicity and Lipschitz resilience guarantees. The authors demonstrate convergence and robustness on synthetic data and identify realistic regimes where mixing ratings and comparisons yields more accurate scores for top items, particularly when active learning prioritizes comparisons among top entities. The findings offer a flexible, scalable basis for preference learning in applications such as content recommendation and model alignment, with implications for how to allocate user effort across feedback types.
Abstract
Should humans be asked to evaluate entities individually or comparatively? This question has long been debated. In this work, we show that combining both forms of preference elicitation can outperform relying on either one alone. More specifically, we introduce SCoRa (Scoring from Comparisons and Ratings), a unified probabilistic model that learns from both signals. We prove that the MAP estimator of SCoRa is well-behaved: it satisfies monotonicity and robustness guarantees. We then show empirically that SCoRa recovers accurate scores, even under model mismatch. Most interestingly, we identify a realistic setting, in which the accurate ordering of top entities is critical, where combining comparisons and ratings outperforms using either one alone. Given the de facto availability of signals of multiple forms, SCoRa additionally offers a versatile foundation for preference learning.
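To make the idea of learning scores jointly from comparisons and ratings concrete, the following is a minimal sketch and not the SCoRa model itself. It assumes a plain Bradley–Terry likelihood for comparisons, treats each binary rating as a comparison between an item and a single learnable threshold, places a Gaussian prior on all parameters, and finds the MAP estimate by gradient ascent. The function name `map_scores` and the parameters `prior_var`, `lr`, and `steps` are hypothetical choices made for this illustration.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def map_scores(n, comparisons, ratings, prior_var=1.0, lr=0.1, steps=2000):
    """Gradient ascent on a toy log-posterior (illustrative assumption).

    comparisons: list of (winner, loser) item indices.
    ratings: list of (item, liked) pairs with liked in {0, 1}.
    Returns (scores, threshold).
    """
    theta = [0.0] * n  # one score per item
    tau = 0.0          # learnable rating threshold
    for _ in range(steps):
        # Gradient of the Gaussian log-prior on scores and threshold.
        grad = [-t / prior_var for t in theta]
        grad_tau = -tau / prior_var
        # Comparisons: P(winner beats loser) = sigmoid(theta_w - theta_l).
        for w, l in comparisons:
            p = sigmoid(theta[w] - theta[l])
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        # Ratings: P(liked) = sigmoid(theta_i - tau).
        for i, liked in ratings:
            p = sigmoid(theta[i] - tau)
            grad[i] += liked - p
            grad_tau -= liked - p
        theta = [t + lr * g for t, g in zip(theta, grad)]
        tau += lr * grad_tau
    return theta, tau


# Toy data: item 0 beats item 1 twice, item 1 beats item 2 once,
# item 0 is rated positively and item 2 negatively.
scores, threshold = map_scores(
    3, [(0, 1), (0, 1), (1, 2)], [(0, 1), (2, 0)]
)
print(scores, threshold)
```

In this toy setup the ratings anchor items relative to the threshold while the comparisons order items relative to each other, so both signals contribute to the same set of scores.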
