RPAF: A Reinforcement Prediction-Allocation Framework for Cache Allocation in Large-Scale Recommender Systems

Shuo Su; Xiaoshuang Chen; Yao Wang; Yulin Wu; Ziqiang Zhang; Kaiqiao Zhan; Ben Wang; Kun Gai

TL;DR

RPAF addresses cache allocation under strict, global per-period computation budgets in large-scale recommender systems. It splits the problem into two stages: a prediction stage, trained with a constrained RL approach that uses a Relaxed Local Allocator (RLA), and an allocation stage that uses PoolRank to make streaming per-request decisions while keeping the budget feasible. RPAF outperforms state-of-the-art baselines in offline simulations and delivers gains in online deployment, including higher daily watch time and user engagement. The work advances cache-augmented recommender design by explicitly modeling the value-strategy dependency and enabling real-time streaming allocation under tight budgets.

Abstract

Modern recommender systems are built upon computation-intensive infrastructure, and limited computational resources make it challenging to perform real-time computation for every request, especially in peak periods. Serving recommendations from user-wise result caches is widely used when the system cannot afford a real-time recommendation. However, it is challenging to allocate real-time and cached recommendations so as to maximize users' overall engagement. This paper identifies two key challenges in cache allocation, namely the value-strategy dependency and streaming allocation, and proposes a reinforcement prediction-allocation framework (RPAF) to address them. RPAF is a reinforcement-learning-based two-stage framework consisting of a prediction stage and an allocation stage. The prediction stage estimates the values of the cache choices while accounting for the value-strategy dependency, and the allocation stage determines the cache choice for each individual request while satisfying the global budget constraint. We show that the challenges of training RPAF include the globality and the strictness of the budget constraint, and we propose a relaxed local allocator (RLA) to address them. Moreover, a PoolRank algorithm is used in the allocation stage to handle the streaming allocation problem. Experiments show that RPAF significantly improves users' engagement under computational budget constraints.

Paper Structure (29 sections, 3 theorems, 29 equations, 11 figures, 3 tables, 2 algorithms)

Introduction
Related Work
Computation Resource Allocation in Recommender Systems
RL in Recommender Systems
CMDP
Cache Allocation Problem
Simplified Cache Allocation Problem
Real Cache Allocation Problem
Value-Strategy Dependency
Streaming Allocation
Real Cache Allocation Problem
Methodology
Overall Framework
Prediction Stage with RLA
Streaming Allocation with PoolRank
...and 14 more sections

Key Result

proposition 1

Given $\mathbb{E}\left[R_t^u|a_t^u\right]$ for each $u$ and each $a_t^u\in\{0,1\}$, the solution to CacheAlloc-Simplified is an $\textbf{arg-top}_M$ rule over per-user scores computed from these expectations, where $\textbf{arg-top}_M$ means that $a_t^u=1$ if $u$ is among the top $M$ users with $c_t^u=1$ when ranked by those scores, and $a_t^u=0$ otherwise.
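A minimal sketch of the $\textbf{arg-top}_M$ rule may help make the proposition concrete. Several choices below are assumptions rather than statements from the paper: $a_t^u=1$ is read as serving a real-time recommendation, $c_t^u=1$ as user $u$ issuing a request in period $t$, and the score is taken to be the expected uplift $\mathbb{E}[R_t^u|a_t^u=1]-\mathbb{E}[R_t^u|a_t^u=0]$. The function name `arg_top_m` and its parameters are hypothetical.

```python
from typing import Dict, Set


def arg_top_m(
    reward_realtime: Dict[str, float],  # assumed E[R_t^u | a_t^u = 1] per user
    reward_cached: Dict[str, float],    # assumed E[R_t^u | a_t^u = 0] per user
    requesting: Set[str],               # users with c_t^u = 1 (assumed: active requests)
    budget_m: int,                      # M: number of a_t^u = 1 slots in this period
) -> Dict[str, int]:
    """Return a_t^u for every user: 1 for the top-M requesting users, else 0."""
    # Assumed score: expected uplift of a_t^u = 1 over a_t^u = 0.
    uplift = {u: reward_realtime[u] - reward_cached[u] for u in requesting}
    # Rank by descending uplift; break ties by user id so the result is deterministic.
    ranked = sorted(uplift, key=lambda u: (-uplift[u], u))
    chosen = set(ranked[: max(budget_m, 0)])
    return {u: int(u in chosen) for u in reward_realtime}


if __name__ == "__main__":
    realtime = {"a": 5.0, "b": 3.0, "c": 4.0, "d": 10.0}
    cached = {"a": 1.0, "b": 2.5, "c": 1.0, "d": 9.0}
    # User "d" has no request this period, so it never receives a_t^u = 1.
    print(arg_top_m(realtime, cached, requesting={"a", "b", "c"}, budget_m=2))
    # → {'a': 1, 'b': 0, 'c': 1, 'd': 0}
```

The sketch ranks once over a full batch of users, which matches the simplified setting only; it does not model the value-strategy dependency or the streaming arrival of requests that the real problem involves.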

Figures (11)

Figure 1: Recommendation with a result cache.
Figure 2: WatchTime decreases when continuously receiving cached recommendations.
Figure 3: The Cache Allocation Problem.
Figure 4: The Real Cache Allocation Problem.
Figure 5: The RPAF method.
...and 6 more figures

Theorems & Definitions (3)

proposition 1
corollary 1
proposition 2
