Denoising Neural Reranker for Recommender Systems

Wenyu Mao, Shuchang Liu, Hailan Yang, Xiaobei Wang, Xiaoyu Yang, Xu Gao, Xiang Li, Lantao Hu, Han Li, Kun Gai, An Zhang, Xiang Wang

TL;DR

This paper tackles the underexplored use of retriever scores to inform the reranker in a two-stage recommender system by treating reranking as a denoising problem. It introduces Denoising Neural Reranker (DNR), an adversarial framework that jointly trains a denoising reranker $q_\theta$ and a learnable noise generator $f_{\boldsymbol{\phi}}$ to augment and denoise retriever scores, guided by three objectives: $\mathcal{L}_z$ (denoising), $\mathcal{L}_{adv}$ (adversarial exploration), and $\mathcal{L}_x$ (score distribution regularization). The approach is supported by theoretical insights on the limitations of direct optimization and is validated through extensive experiments on three public datasets and an industrial system, showing consistent improvements over strong baselines. The findings demonstrate practical impact by improving final exposure quality while maintaining efficiency, and they suggest promising future directions for integrating noise-aware denoising into broader multi-stage recommendation pipelines.

Abstract

In industrial multi-stage recommenders, a user request first triggers a simple and efficient retriever module that selects and ranks a list of relevant items; the recommender then calls a slower but more sophisticated reranking model that refines the item list exposed to the user. To consistently optimize this two-stage retrieval-reranking framework, most efforts have focused on learning reranker-aware retrievers. In contrast, there has been limited work on how to achieve a retriever-aware reranker. In this work, we provide evidence that the retriever scores from the previous stage are informative signals that have been underexplored. Specifically, we first show empirically that the reranking task under the two-stage framework is naturally a noise reduction problem on the retriever scores, and show theoretically the limitations of naively using the retriever scores. Following this notion, we derive an adversarial framework, DNR, that couples the denoising reranker with a carefully designed noise generation module. The resulting DNR solution extends the conventional score error minimization loss with three augmented objectives: 1) a denoising objective that aims to denoise the noisy retriever scores to align with the user feedback; 2) an adversarial retriever score generation objective that improves the exploration in the retriever score space; and 3) a distribution regularization term that aims to align the distribution of generated noisy retriever scores with the real ones. We conduct extensive experiments on three public datasets and an industrial recommender system, together with analytical support, to validate the effectiveness of the proposed DNR.
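
The two-stage flow described in the abstract can be illustrated with a toy sketch. This is not the DNR model: the dot-product retriever, the linear reranker, and the names `retrieve`, `rerank`, and `score_weight` are all hypothetical stand-ins, and the random vectors are placeholder data. The only point shown is that the retriever score is passed on to the reranker as an extra input signal.

```python
import random

def retrieve(user_vec, item_vecs, k):
    """Stage 1 (hypothetical): cheap dot-product scorer, returns top-k (item_id, score)."""
    scores = [sum(u * v for u, v in zip(user_vec, item)) for item in item_vecs]
    ranked = sorted(range(len(item_vecs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]

def rerank(candidates, item_vecs, weights, score_weight, n):
    """Stage 2 (hypothetical): toy linear reranker that also consumes the retriever score."""
    def final_score(candidate):
        item_id, retriever_score = candidate
        content = sum(w * v for w, v in zip(weights, item_vecs[item_id]))
        # The retriever score enters as one additional feature.
        return content + score_weight * retriever_score
    ranked = sorted(candidates, key=final_score, reverse=True)
    return [item_id for item_id, _ in ranked[:n]]

# Placeholder data: 50 random items, one random user, random reranker weights.
rng = random.Random(0)
item_vecs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]
user_vec = [rng.gauss(0, 1) for _ in range(4)]
reranker_weights = [rng.gauss(0, 1) for _ in range(4)]

candidates = retrieve(user_vec, item_vecs, k=20)   # candidate set from stage 1
exposure = rerank(candidates, item_vecs, reranker_weights, score_weight=0.5, n=5)
print(exposure)                                    # item ids shown to the user
```

Setting `score_weight=0.0` in this sketch gives a reranker that ignores the retriever scores entirely, which is the retriever-unaware setting the abstract contrasts against.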

Paper Structure

This paper contains 41 sections, 10 equations, 8 figures, 15 tables, 1 algorithm.

Figures (8)

  • Figure 1: (a) The informative retriever score and (b-d) noise reduction nature of reranking. The reranker in this example uses a transformer model to select the top-20 items from the candidates. "Rerank w/ s" represents using retriever scores as additional item features for the reranker. The black shaded circles in (c) and (d) represent the retriever's selection of candidate items, and the green shaded circle in (d) represents those of the reranker for exposure. (e) compares the noise distribution (distance between predicted scores and ground-truth labels) of the retriever and the reranker.
  • Figure 2: Overall framework of multi-stage recommender system (on the left) and the noise reduction formulation of our method, DNR.
  • Figure 3: Sensitivity analysis of hyperparameters (i.e., $\lambda_c$, $\lambda_m$, and $\lambda_e$) for our method on the Kuaivideo dataset.
  • Figure 4: Different ways to leverage retriever scores.
  • Figure 5: The visualization of generated noise from different variants.
  • ...and 3 more figures
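
The "noise" compared in Figure 1(e), described in the caption as the distance between predicted scores and ground-truth labels, could be computed along the following lines. This is a minimal sketch: the absolute difference is an assumed distance (the caption does not name the metric), the helper `noise` is hypothetical, and the scores and labels are made-up illustrative values, not results from the paper.

```python
from statistics import mean

def noise(scores, labels):
    """Per-item noise, assumed here to be |predicted score - ground-truth label|."""
    return [abs(s - y) for s, y in zip(scores, labels)]

# Made-up toy values for five candidate items (1 = positive user feedback).
labels = [1, 0, 0, 1, 0]
retriever_scores = [0.6, 0.5, 0.4, 0.5, 0.3]
reranker_scores = [0.9, 0.2, 0.1, 0.8, 0.1]

retriever_noise = noise(retriever_scores, labels)
reranker_noise = noise(reranker_scores, labels)

# In this toy setup the reranker's scores sit closer to the labels on average.
print(mean(retriever_noise), mean(reranker_noise))
```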