Subtractive random forests with two choices

Francisco Calvillo; Luc Devroye; Gábor Lugosi

TL;DR

This work extends the Subtractive Random Forests framework to a two-choice sequential recommendation setting, where each time step offers two independent candidate topics drawn via the same temporal mechanism and a user selects the one with lower perceived value. By modeling the topic repertoire as a directed acyclic graph whose leaves are the non-positive indices, the authors define the key quantities $R_n$, $L_n$, and $\Lambda_n$ (the number of leaves) and analyze the long-run behavior of the system under three tail regimes for the delay distribution: light, moderate, and heavy. The main contributions are rigorous consistency results across these regimes, including almost-sure and in-probability statements for the leftmost and rightmost leaves and for the leaf count under various tail assumptions, plus a generalization to $k$ choices. Simulations corroborate the theoretical predictions, illustrating how tail heaviness shapes the convergence and diversity of recommendations, and the work provides qualitative insight into how structural properties of sequential recommendation processes influence long-term outcomes. The study lays groundwork for future extensions incorporating personalization, context, and time-varying user preferences, to bridge the gap to practical recommender systems.

Abstract

Recommendation systems are pivotal in aiding users amid vast online content. Broutin, Devroye, Lugosi, and Oliveira proposed Subtractive Random Forests (\textsc{surf}), a model that emphasizes temporal user preferences. Expanding on \textsc{surf}, we introduce a model for a multi-choice recommendation system, enabling users to select from two independent suggestions based on past interactions. We evaluate its effectiveness and robustness across diverse scenarios, incorporating heavy-tailed distributions for time delays. By analyzing user topic evolution, we assess the system's consistency. Our study offers insights into the performance and potential enhancements of multi-choice recommendation systems in practical settings.

Paper Structure (15 sections, 18 theorems, 82 equations, 5 figures, 1 table)

Introduction
A two-choice recommendation model
Related work
Results
Summary
Simulations
The one-choice model
Rightmost and leftmost leaves
Number of leaves
Moderate tails
Heavy tails
Generalization to $k$ choices.
Conclusion
Comparison with standard baselines
Perspectives and open problems

Key Result

Theorem 2

Let $L_\infty=\inf_{n\in \mathbb{N}} L_n$. If $\mathbb{E} Z < \infty$, then $L_\infty$ is finite almost surely. Thus, the sequences $(R_n)_{n\in\mathbb{N}},(L_n)_{n\in\mathbb{N}}$ and $(\Lambda_n)_{n\in\mathbb{N}}$ are almost surely bounded.
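The quantities named in Theorem 2 can be explored numerically. The following is a minimal sketch, not the authors' simulation code. It assumes that each vertex $i \ge 1$ points to the two earlier indices $i - Z_i$ and $i - Z'_i$, with $Z_i, Z'_i$ i.i.d. positive integer delays, and that the leaves reachable from $n$ are the non-positive indices hit by following these edges. The geometric delay law (a light tail with $\mathbb{E} Z < \infty$), the function names, and the parameter `p` are illustrative assumptions, and the user's selection rule between the two candidates is not modeled.

```python
import random


def sample_delay(rng, p=0.5):
    """Geometric delay on {1, 2, ...} (assumed light-tailed example law)."""
    z = 1
    while rng.random() >= p:
        z += 1
    return z


def leaf_statistics(n_max, seed=0, p=0.5):
    """Return {n: (L_n, R_n, Lambda_n)} for n = 1..n_max.

    Assumption: vertex i has edges to i - Z_i and i - Z'_i; any index <= 0
    is a leaf. L_n and R_n are the smallest and largest leaf reachable
    from n, and Lambda_n is the number of distinct reachable leaves.
    """
    rng = random.Random(seed)
    leaves = {}  # leaves[i]: frozenset of leaves reachable from vertex i
    stats = {}
    for i in range(1, n_max + 1):
        reachable = set()
        for _ in range(2):  # the two independent candidates
            target = i - sample_delay(rng, p)
            if target <= 0:
                reachable.add(target)          # hit a leaf directly
            else:
                reachable |= leaves[target]    # inherit leaves of an earlier vertex
        leaves[i] = frozenset(reachable)
        stats[i] = (min(reachable), max(reachable), len(reachable))
    return stats


if __name__ == "__main__":
    stats = leaf_statistics(200, seed=1)
    # Running infimum of L_n over the simulated horizon
    print(min(l for l, _, _ in stats.values()))
```

Because every delay is at least 1, each edge points to a strictly smaller index, so the leaf sets can be built in a single forward pass. Swapping `sample_delay` for a heavier-tailed law is the natural way to compare regimes under the same assumptions.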

Figures (5)

Figure 1: Illustration of the subgraph induced by $T_n$. Shown are the definitions of $L_n$, $R_n$, and $\mathcal{L}_n$. Blue edges indicate the $Z$-chain of vertices $T_n^Z$, while red edges connect all vertices in the chain $S_n$.
Figure 2: Simulation of the evolution of $V_n$ in configuration (iii) for a single run across light-tailed, moderate-tailed, and heavy-tailed regimes.
Figure 3: Evolution of the empirical mean of $V_n$ in configuration (iii) over $1000$ independent simulations for light-tailed, moderate-tailed, and heavy-tailed regimes.
Figure 4: Visualization of the subgraph induced by $T_n$ in the light-, moderate-, and heavy-tailed regimes, with $n = 150$.
Figure 5: Illustration of the proof of the theorem labeled "L to infinity almost surely".

Theorems & Definitions (32)

Remark 1
Definition 1
Remark 2
Theorem 2
proof
Theorem 3
proof
Theorem 4
Theorem 5
Theorem 6
...and 22 more
