Double Correction Framework for Denoising Recommendation

Zhuangzhuang He; Yifan Wang; Yonghui Yang; Peijie Sun; Le Wu; Haoyue Bai; Jinqi Gong; Richang Hong; Min Zhang

TL;DR

The paper addresses noisy implicit feedback in recommender systems and critiques sample dropping that relies purely on training loss. It introduces the Double Correction Framework (DCF), which combines sample dropping correction with progressive label correction to stabilize training and reuse informative but noisily labeled data. The method averages each sample's loss over time through a damping function for robustness to outliers, uses a lower bound derived from a concentration inequality to identify hard samples, and applies a progressive relabeling schedule to mitigate data sparsity. Experiments on three datasets and four backbones show consistent improvements, and ablation studies support the contribution of each component.

Abstract

Owing to its availability and generality in online services, implicit feedback is commonly used in recommender systems. However, implicit feedback in real-world recommendation scenarios usually contains noisy samples (such as misclicks or non-preferential behaviors), which hinder precise user preference learning. A popular solution to the noisy sample problem is to drop noisy samples during model training, based on the observation that noisy samples have higher training losses than clean samples. Despite its effectiveness, we argue that this solution still has limitations. (1) High training losses can result from model optimization instability or from hard samples, not just from noisy samples. (2) Completely dropping noisy samples aggravates data sparsity and fails to exploit the data fully. To tackle these limitations, we propose a Double Correction Framework for Denoising Recommendation (DCF), which contains two correction components: one for more precise sample dropping and one for avoiding sparser data. In the sample dropping correction component, we use each sample's loss values over time to determine whether it is noise, which increases dropping stability. Instead of averaging the losses directly, we apply a damping function to reduce the bias introduced by outliers. Furthermore, because hard samples exhibit higher variance, we derive a lower bound for the loss through a concentration inequality to identify and reuse hard samples. In the progressive label correction component, we iteratively relabel highly deterministic noisy samples and retrain on them to further improve performance. Finally, extensive experimental results on three datasets and four backbones demonstrate the effectiveness and generalization of our proposed framework.
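
The two correction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the damping function or the exact bound, so `np.log1p` as the damping function, a Chebyshev-style bound with a plug-in standard deviation, the parameters `delta`, `drop_ratio` and `relabel_ratio`, all function names, and the toy loss values are assumptions made only for illustration.

```python
import numpy as np

def damped_mean(loss_history):
    """Average per-epoch losses after damping (log1p is an assumed choice)."""
    return np.log1p(loss_history).mean(axis=1)

def loss_lower_bound(loss_history, delta=0.1):
    """Chebyshev-style lower bound on the damped mean loss (assumed form).

    For n observations with standard deviation s, Chebyshev's inequality
    gives: true mean >= sample mean - s / sqrt(n * delta) with probability
    at least 1 - delta. The sample std is plugged in for s here.
    """
    damped = np.log1p(loss_history)
    n = damped.shape[1]
    return damped.mean(axis=1) - damped.std(axis=1) / np.sqrt(n * delta)

def select_samples(loss_history, drop_ratio, relabel_ratio, delta=0.1):
    """Return (indices to drop, indices to relabel), ranked by the bound.

    High-variance (hard) samples get a low bound and are kept; samples
    with a consistently high loss get a high bound and are treated as noise.
    """
    bound = loss_lower_bound(loss_history, delta)
    order = np.argsort(-bound)                  # highest bound first
    n_drop = int(drop_ratio * len(bound))
    n_relabel = min(int(relabel_ratio * len(bound)), n_drop)
    return order[:n_drop], order[:n_relabel]

# Toy loss histories (rows: samples, columns: epochs); values are made up.
history = np.array([
    [0.20, 0.10, 0.15, 0.10, 0.05],   # clean: consistently low loss
    [0.10, 0.10, 3.00, 0.10, 0.10],   # clean with one unstable epoch
    [0.20, 2.50, 0.30, 2.80, 0.40],   # hard: high but fluctuating loss
    [2.00, 2.20, 2.10, 2.30, 2.00],   # noisy: consistently high loss
])

bound = loss_lower_bound(history)
drop_idx, relabel_idx = select_samples(history, drop_ratio=0.25, relabel_ratio=0.25)

# All interactions start as positives (label 1); the most certain noise is flipped.
labels = np.ones(len(history))
labels[relabel_idx] = 0.0
```

In this toy setting only the consistently high-loss sample is selected, while the fluctuating hard sample and the sample with a single unstable epoch are retained. In a progressive schedule, `relabel_ratio` would presumably be increased gradually over training, but the exact schedule is not given in the abstract.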

Paper Structure (27 sections, 21 equations, 6 figures, 5 tables, 1 algorithm)

Introduction
Task Description
The Proposed Framework
Overview
Sample Dropping Correction
Confirmed Loss Calculation
Cautious Hard Samples Search
Progressive Label Correction
Model Discussion
Experiments
Experimental Settings
Datasets.
Evaluation protocols.
Baselines.
Performance Comparison (RQ1)
...and 12 more sections

Figures (6)

Figure 1: (a) Illustration of unstable losses. Clean samples do not exhibit low loss in every epoch, and noisy samples do not always exhibit high loss. Hard samples also exhibit high losses. However, noisy samples can be identified from the perspective of mean loss. (b) Different strategies in their ideal cases, showing the difference between our relabeling strategy and other strategies under ideal conditions. In TP and TN, T stands for true; similarly, F stands for false.
Figure 2: Illustration comparison: (a) Normal training model without denoising, (b) Denoising model with drop strategy, (c) Double correction framework for denoising recommendation (Ours).
Figure 3: Details of Loss & Label Correction.
Figure 4: Impact of the relabel ratio $R$, search discretion level $\sigma^2$ and time interval $v$.
Figure 5: Comparative experiments on three datasets with two backbones validate the effectiveness of our hard sample search strategy to improve model performance.
...and 1 more figure