The Fault in Our Recommendations: On the Perils of Optimizing the Measurable

Omar Besbes; Yash Kanoria; Akshit Kumar

TL;DR

The paper demonstrates a fundamental misalignment between engagement maximization and user utility in recommendation systems. It analyzes a stylized infinite-horizon model with popular and niche items, showing that engagement-optimized policies can underperform on utility, especially when niche value is uncertain but potentially high for a minority. Introducing PEAR, a utility-aware exploratory policy, the authors prove that near-optimal utility can be achieved with only modest engagement sacrifices, and that this advantage grows as platforms become more forward-looking. Robustness checks with general distributions and prior-free policies (DICE) show that exploration can substantially improve utility without large engagement penalties in many regimes, highlighting a practical path to better long-term discovery in recommender systems.

Abstract

Recommendation systems are widespread, and through customized recommendations, promise to match users with options they will like. To that end, data on engagement is collected and used. Most recommendation systems are ranking-based: they rank and recommend items based on their predicted engagement. However, engagement signals are often only a crude proxy for utility, as data on the latter is rarely collected or available. This paper explores the following question: By optimizing for measurable proxies, are recommendation systems at risk of significantly under-delivering on utility? If so, how can one improve utility, which is seldom measured? To study these questions, we introduce a model of repeated user consumption in which, at each interaction, users select between an outside option and the best option from a recommendation set. Our model accounts for user heterogeneity, with the majority preferring "popular" content and a minority favoring "niche" content. The system initially lacks knowledge of individual user preferences but can learn them through observations of users' choices over time. Our theoretical and numerical analyses demonstrate that optimizing for engagement can lead to significant utility losses. Instead, we propose a utility-aware policy that initially recommends a mix of popular and niche content. As the platform becomes more forward-looking, our utility-aware policy achieves the best of both worlds: near-optimal utility and near-optimal engagement simultaneously. Our study elucidates an important feature of recommendation systems: given the ability to suggest multiple items, one can perform significant exploration without incurring significant reductions in engagement. By recommending high-risk, high-reward items alongside popular items, systems can enhance discovery of high-utility items without significantly affecting engagement.
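The consumption model described in the abstract can be illustrated with a small simulation. The following is only a rough sketch under assumptions that are not taken from the paper: Gaussian noise on item and outside-option utilities, a two-item recommendation set, the specific parameter values, and a simple "explore once, then commit" rule standing in for a utility-aware policy. It is not the authors' PEAR or APP definition; all function and parameter names are hypothetical.

```python
import random


def simulate_user(policy, p_niche=0.2, v_pop=1.0, v_niche_high=3.0,
                  v_niche_low=-1.0, delta=0.9, horizon=200, rng=None):
    """Return discounted (engagement, utility) for one simulated user.

    All numeric defaults and the Gaussian noise are illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()
    # Hidden user type: a minority values niche content highly.
    loves_niche = rng.random() < p_niche
    base = {"popular": v_pop,
            "niche": v_niche_high if loves_niche else v_niche_low}
    engagement = utility = 0.0
    history = []  # past choices; None means the outside option was taken
    for t in range(horizon):
        rec_set = policy(history)
        # Realized utility of each recommended item: base value plus noise.
        offers = [(base[kind] + rng.gauss(0, 1), kind) for kind in rec_set]
        best_value, best_kind = max(offers)
        outside = rng.gauss(0, 1)  # outside option, mean utility 0 (assumed)
        if best_value > outside:
            choice, realized, engaged = best_kind, best_value, 1.0
        else:
            choice, realized, engaged = None, outside, 0.0
        history.append(choice)
        engagement += delta ** t * engaged
        utility += delta ** t * realized
    return engagement, utility


def always_popular(history):
    """Recommend only popular items, regardless of what was observed."""
    return ["popular", "popular"]


def explore_then_commit(history):
    """Show a popular/niche mix once, then commit based on the first choice."""
    if not history:
        return ["popular", "niche"]
    if history[0] == "niche":
        return ["niche", "niche"]
    return ["popular", "popular"]


def average(policy, n_users=2000, seed=0, **kwargs):
    """Average discounted engagement and utility over simulated users."""
    rng = random.Random(seed)
    results = [simulate_user(policy, rng=rng, **kwargs) for _ in range(n_users)]
    eng = sum(e for e, _ in results) / n_users
    util = sum(u for _, u in results) / n_users
    return eng, util


if __name__ == "__main__":
    for name, policy in [("always_popular", always_popular),
                         ("explore_then_commit", explore_then_commit)]:
        eng, util = average(policy)
        print(f"{name}: engagement={eng:.3f}, utility={util:.3f}")
```

Running the script prints average discounted engagement and utility for both rules, which makes it possible to vary `p_niche`, `v_niche_high`, and `delta` and see how the two metrics move. Any numbers it produces reflect the assumed toy parameters, not results from the paper.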

Paper Structure (16 sections, 7 theorems, 29 equations, 3 figures, 2 algorithms)

Introduction
Related Literature
Organization of the paper
Model
Analysis of the Two-point distribution for the niche type
APP: Engagement Optimal Policy
PEAR: A utility-aware heuristic
Asymmetry in the engagement-utility misalignment
Robustness of Insights Under General Settings
Conclusion
Useful Technical Result
Proof of Theorem \ref{thm:engagement-maximizing-policy-two-recommendations}
Proof of Theorem \ref{thm:utility-maximizing-policy-two-recommednations}
Proof of Helper Lemmas
Proof of Lemma \ref{lem:stopping-time-equivalence}
...and 1 more sections

Key Result

Theorem 1

Fix the base utility of the popular item type $V_{\sf P} \in \mathbb{R}_{+}$ and the discount factor $\delta \in [0,1)$. There exists a $p_0 = p_0(\delta, V_{\sf P}) \in [0,1]$ such that for all $p \leq p_{0}$, ${\sf APP}$ as defined in eq:always-popular-policy maximizes engagement as defined in

Figures (3)

Figure 2: CDF of Generalized Pareto Distribution for $\mu = -1$ and $\sigma = 1 - \xi$
Figure 3: Impact of $\xi$ on utility and engagement for $\delta = 0$
Figure 4: Impact of $\xi$ on utility and engagement for $\delta = 0.999$

Theorems & Definitions (7)

Theorem 1: Engagement Optimal Policy
Theorem 2: Analysis of PEAR
Corollary 1: Asymptotic Optimality of PEAR
Corollary 2: Asymmetry in the Misalignment
Proposition EC.1
Lemma App.1
Lemma App.2
