On the Accuracy Limits of Sequential Recommender Systems: An Entropy-Based Approach

En Xu, Jingtao Ding, Yong Li

Abstract

Sequential recommender systems have achieved steady gains in offline accuracy, yet it remains unclear how close current models are to the intrinsic accuracy limit imposed by the data. A reliable, model-agnostic estimate of this ceiling would enable principled difficulty assessment and headroom estimation before costly model development. Existing predictability analyses typically combine entropy estimation with Fano's inequality inversion; however, in recommendation they are hindered by sensitivity to candidate-space specification and distortion from Fano-based scaling in low-predictability regimes. We develop an entropy-induced, training-free approach for quantifying accuracy limits in sequential recommendation, yielding a candidate-size-agnostic estimate. Experiments on controlled synthetic generators and diverse real-world benchmarks show that the estimator tracks oracle-controlled difficulty more faithfully than baselines, remains insensitive to candidate-set size, and achieves high rank consistency with best-achieved offline accuracy across state-of-the-art sequential recommenders (Spearman rho up to 0.914). It also supports user-group diagnostics by stratifying users by novelty preference, long-tail exposure, and activity, revealing systematic predictability differences. Furthermore, predictability can guide training data selection: training sets constructed from high-predictability users yield strong downstream performance under reduced data budgets. Overall, the proposed estimator provides a practical reference for assessing attainable accuracy limits, supporting user-group diagnostics, and informing data-centric decisions in sequential recommendation.

Paper Structure

This paper contains 32 sections, 1 theorem, 21 equations, 11 figures, 2 tables, 1 algorithm.

Key Result

Theorem 4.1

For any history $h_t$, the predictability of the original task satisfies $\Pi(h_t) \ge \Pi_S(h_t) = 1/M(h_t) = \exp(-S(h_t))$, where $S(h_t)$ is the entropy of the conditional distribution $P(X_{t+1}\mid h_t)$ and $M(h_t)=\exp(S(h_t))$ is the effective candidate size.

Figures (11)

  • Figure 1: Overview of accuracy-limit characterization in sequential recommendation. (a) Task illustration. (b) Best-achieved offline accuracy continues to improve, while a model-agnostic attainable reference is still lacking. (c) Entropy-based predictability estimation with Fano scaling versus our entropy-induced characterization without Fano scaling.
  • Figure 2: Effect of $N$ on predictability estimates under the classical Fano mapping. For a fixed entropy level $S$, the solution $\Pi$ to Eq. (\ref{eqn:song}) increases with $N$, indicating that the choice of $N$ can substantially change the resulting estimate.
  • Figure 3: Entropy-induced predictability under a lossless Huffman-coding perspective. (a) Given a history $h_t$, Huffman coding assigns codewords to next-state symbols according to the conditional distribution $P(X_{t+1}\mid h_t)$. (b) An equivalent view: the probability mass function $\{p(x\mid h_t)\}$ determines symbol code lengths $\ell(x)$, so that "locating the next state on average" is operationally equivalent to "the expected code length." Since the optimal expected code length matches entropy up to an additive constant, this perspective motivates the effective size $M(h_t)=\exp(S)$ and the reference-level hit rate $\Pi_S(h_t)=1/M(h_t)$.
  • Figure 4: Predictability estimation on two synthetic generators: Session Reset and Repeat Last. The horizontal axis shows the theoretical predictability, and the vertical axis reports predictability estimates produced by different methods.
  • Figure 5: $N$-sweep results under the Context Switch generator. We fix $\mathrm{Hit@1}^{\mathrm{Oracle}}=0.15$ and vary the candidate-set size $N$ to compare how different predictability estimators change with $N$.
  • ...and 6 more figures
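The contrast drawn in Figures 2 and 5 can be sketched numerically. Below is a minimal Python sketch, not the paper's code: it assumes the equation labeled eqn:song takes the standard Song-style Fano form $S = H_b(\Pi) + (1-\Pi)\log_2(N-1)$, solves it for $\Pi$ by bisection, and contrasts the result with the entropy-induced reference $\Pi_S = \exp(-S)$, which does not depend on $N$.

```python
import math

def fano_predictability(S_bits, N, iters=200):
    # Assumed classical Fano inversion: solve
    #   S = H_b(Pi) + (1 - Pi) * log2(N - 1)
    # for Pi on [1/N, 1]; the left-hand side minus S is
    # monotone decreasing in Pi on that interval, so bisection applies.
    def gap(pi):
        hb = 0.0
        if 0.0 < pi < 1.0:
            hb = -pi * math.log2(pi) - (1.0 - pi) * math.log2(1.0 - pi)
        return hb + (1.0 - pi) * math.log2(N - 1) - S_bits

    lo, hi = 1.0 / N, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid  # still above the target entropy: root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy_induced_predictability(S_nats):
    # Effective candidate size M = exp(S); reference hit rate Pi_S = 1/M.
    return math.exp(-S_nats)

if __name__ == "__main__":
    S_nats = 1.5                      # illustrative conditional entropy, in nats
    S_bits = S_nats / math.log(2)
    for N in (100, 1000, 10000):
        # Fano estimate drifts upward as the candidate set grows
        print(N, round(fano_predictability(S_bits, N), 3))
    # entropy-induced reference: one number, independent of N
    print(round(entropy_induced_predictability(S_nats), 3))
```

Running the sweep reproduces the qualitative pattern of Figure 2: the Fano solution grows with $N$ at fixed entropy, while the entropy-induced value is a single $N$-free number.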

Theorems & Definitions (2)

  • Theorem 4.1: Entropy-induced lower bound
  • Remark 4.1: Operational meaning of $M(h_t)$ via lossless coding
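The lossless-coding view in Remark 4.1 and Figure 3 can be made concrete with a small illustration (again a sketch, not the paper's code): build a Huffman code for a conditional next-item distribution, verify that the expected code length matches the entropy (exactly so for the dyadic distribution chosen here), and read off the effective candidate size, $M = 2^{S}$ with $S$ in bits, equivalently $\exp(S)$ with $S$ in nats, whose reciprocal is the reference hit rate $\Pi_S$.

```python
import heapq
import math

def huffman_lengths(probs):
    # Build a Huffman code over symbols 0..len(probs)-1 and return
    # the codeword length of each symbol. Heap entries carry
    # (probability, tiebreaker, member symbols of the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:       # every merge adds one bit to each member
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

# Hypothetical conditional distribution P(X_{t+1} | h_t) over 4 items;
# dyadic probabilities make expected code length equal entropy exactly.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_lengths(probs)
L = sum(p, l := None) if False else sum(p * l for p, l in zip(probs, lengths))
S_bits = -sum(p * math.log2(p) for p in probs)
M = 2.0 ** S_bits          # effective candidate size
Pi_S = 1.0 / M             # reference-level hit rate
```

Here the expected code length $L$ equals $S$ = 1.75 bits, so "locating the next item on average" costs exactly the entropy, and $M \approx 3.36$ effective candidates, smaller than the nominal 4, which is precisely the candidate-size-agnostic quantity the figure motivates.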