FlashEvaluator: Expanding Search Space with Parallel Evaluation

Chao Feng, Yuanhao Pu, Chenghao Zhang, Shanqi Liu, Shuchang Liu, Xiang Li, Yongqi Liu, Lantao Hu, Kaiqiao Zhan, Han Li, Kun Gai

TL;DR

This paper proposes FlashEvaluator, which shares token information across sequences and processes all candidate sequences in a single forward pass. This yields sublinear computational complexity, which improves system efficiency, and supports direct inter-sequence comparisons, which improve selection accuracy.

Abstract

The Generator-Evaluator (G-E) framework, i.e., evaluating K sequences from a generator and selecting the top-ranked one according to evaluator scores, is a foundational paradigm in tasks such as Recommender Systems (RecSys) and Natural Language Processing (NLP). Traditional evaluators process sequences independently and suffer from two major limitations: (1) a lack of explicit cross-sequence comparison, leading to suboptimal accuracy; and (2) poor parallelization with linear O(K) complexity, resulting in inefficient resource utilization and a negative impact on both throughput and latency. To address these challenges, we propose FlashEvaluator, which enables cross-sequence token information sharing and processes all sequences in a single forward pass. This yields sublinear computational complexity that improves the system's efficiency and supports direct inter-sequence comparisons that improve selection accuracy. The paper also provides theoretical proofs and extensive experiments on recommendation and NLP tasks, demonstrating clear advantages over conventional methods. Notably, FlashEvaluator has been deployed in the online recommender system of Kuaishou, delivering substantial and sustained revenue gains in practice.
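
To make the efficiency argument concrete, the following is a minimal Python sketch of the G-E selection step under strong simplifying assumptions. It is not the FlashEvaluator architecture: `encode_item` is a hypothetical stand-in for an expensive per-item encoder, the score is a plain average of item encodings, and only the redundancy-elimination side of the idea is shown (items shared by several candidates are encoded once), not learned cross-sequence comparison. All function names and the toy candidate set are illustrative.

```python
from typing import Dict, List, Sequence, Tuple


def encode_item(item_id: int) -> float:
    """Hypothetical stand-in for an expensive per-item encoder."""
    return (item_id * 37 % 11) / 10.0


def score_independently(candidates: Sequence[Sequence[int]]) -> Tuple[List[float], int]:
    """Score each sequence on its own; every item occurrence is re-encoded."""
    encoder_calls = 0
    scores: List[float] = []
    for seq in candidates:
        total = 0.0
        for item in seq:
            total += encode_item(item)
            encoder_calls += 1
        scores.append(total / len(seq))
    return scores, encoder_calls


def score_jointly(candidates: Sequence[Sequence[int]]) -> Tuple[List[float], int]:
    """Encode each distinct item once, then score all sequences from the shared cache."""
    cache: Dict[int, float] = {}
    for seq in candidates:
        for item in seq:
            if item not in cache:
                cache[item] = encode_item(item)
    scores = [sum(cache[item] for item in seq) / len(seq) for seq in candidates]
    return scores, len(cache)


def select_top(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring candidate sequence."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy example: K = 4 candidate sequences drawn from a pool of 4 items.
candidates = [[1, 2, 3], [2, 3, 4], [1, 3, 4], [1, 2, 4]]
ind_scores, ind_calls = score_independently(candidates)
joint_scores, joint_calls = score_jointly(candidates)
print(ind_calls, joint_calls, select_top(ind_scores), select_top(joint_scores))
```

In this toy setting both paths select the same candidate, but the independent path invokes the encoder once per item occurrence while the shared path invokes it once per distinct item. The gap widens as more candidates overlap in their items.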

Paper Structure (35 sections, 12 theorems, 68 equations, 2 figures, 5 tables)

This paper contains 35 sections, 12 theorems, 68 equations, 2 figures, and 5 tables.

Key Result

Theorem 5.3

Suppose the loss and capacity assumptions hold. Fix any $\delta\in(0,1)$ and sample size $n$. Then, with probability at least $1-\delta$, the generalization gaps of both evaluators are bounded. Consequently, the independent evaluator's bound is asymptotically $\Theta(\sqrt{K})$ times larger than the joint evaluator's.

Figures (2)

  • Figure 1: Comparison of efficiency
  • Figure :

Theorems & Definitions (23)

  • Definition 3.1: Ground-truth sequence utility
  • Theorem 5.3: Comparison of surrogate bounds
  • Definition 5.4: Item Reuse Factor
  • Proposition 5.5: Computational Advantage via Redundancy Elimination
  • Lemma 1.1: Uniform deviation bound
  • Lemma 1.2: Lipschitz reduction to score classes
  • proof
  • Proposition 1.3: Surrogate generalization bound for joint Evaluator
  • proof
  • Lemma 1.4: Complexity of independent Evaluator
  • ...and 13 more