RecDCL: Dual Contrastive Learning for Recommendation
Dan Zhang, Yangliao Geng, Wenwen Gong, Zhongang Qi, Zhiyu Chen, Xing Tang, Ying Shan, Yuxiao Dong, Jie Tang
TL;DR
RecDCL tackles sparse user-item data by jointly optimizing batch-wise contrastive learning (BCL) and feature-wise contrastive learning (FCL) in a single dual-contrastive framework. It defines two feature-wise objective components, $L_{UIBT}$ and $L_{UUII}$, and combines them with a batch-wise loss $L_{BCL}$ as $\mathcal{L} = L_{UIBT} + \alpha L_{UUII} + \beta L_{BCL}$; standardizing the embeddings exposes the inherent connection between BCL and FCL. The paper shows theoretically that combining BCL and FCL eliminates redundant solutions without missing an optimal one. Experiments on four public datasets and one industrial dataset show consistent improvements over state-of-the-art methods, with gains of up to $5.65\%$ in recall and $5.34\%$ in NDCG. The code is publicly available, and the industrial-dataset results suggest practical value for real-world recommender systems.
Abstract
Self-supervised learning (SSL) has recently achieved great success in mining user-item interactions for collaborative filtering. As a major paradigm, contrastive learning (CL) based SSL helps address data sparsity on Web platforms by contrasting the embeddings of raw and augmented data. However, existing CL-based methods mostly contrast in a batch-wise way and fail to exploit potential regularity in the feature dimension. This leads to redundant solutions during the representation learning of users and items. In this work, we investigate how to employ both batch-wise CL (BCL) and feature-wise CL (FCL) for recommendation. We theoretically analyze the relation between BCL and FCL, and find that combining them helps eliminate redundant solutions while never missing an optimal solution. We propose a dual contrastive learning recommendation framework, RecDCL. In RecDCL, the FCL objective is designed to eliminate redundant solutions on user-item positive pairs and to optimize the uniform distributions within users and within items, using a polynomial kernel to drive the representations toward orthogonality; the BCL objective generates contrastive embeddings on the output vectors to enhance the robustness of the representations. Extensive experiments on four widely used benchmarks and one industry dataset demonstrate that RecDCL consistently outperforms state-of-the-art GNN-based and SSL-based models (with an improvement of up to 5.65% in terms of Recall@20). The source code is publicly available (https://github.com/THUDM/RecDCL).
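To make the structure of the combined objective concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a Barlow Twins-style cross-correlation term for the user-item positive pairs, a log-mean polynomial kernel over feature columns for the within-user and within-item uniformity term, and a negative-cosine alignment between two views as a simple stand-in for the batch-wise term. All function names, the `gamma`, `a`, `c`, and `degree` defaults, and the noisy "views" are hypothetical choices made only for illustration.

```python
import numpy as np

EPS = 1e-12  # numerical guard; an illustrative choice, not from the paper


def standardize(x):
    """Zero mean and unit variance per feature (column) across the batch."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + EPS)


def cross_correlation_loss(users, items, gamma=0.01):
    """Feature-wise term on positive pairs (assumed Barlow Twins-style form).

    Row i of `users` is assumed to have interacted with row i of `items`.
    """
    n = users.shape[0]
    c = standardize(users).T @ standardize(items) / n  # (d, d) cross-correlation
    diag = np.diag(c)
    on_diag = np.sum((1.0 - diag) ** 2)            # pull matched features together
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)  # decorrelate distinct features
    return on_diag + gamma * off_diag


def polynomial_uniformity(x, a=1.0, c=1.0, degree=2):
    """Assumed uniformity term: log-mean polynomial kernel over feature columns.

    The value is smallest when the feature columns are mutually orthogonal.
    """
    xn = x / (np.linalg.norm(x, axis=0) + EPS)      # unit-norm columns
    gram = xn.T @ xn                                # (d, d) column similarities
    off = gram[~np.eye(gram.shape[0], dtype=bool)]  # off-diagonal entries only
    return np.log(np.mean((a * off + c) ** degree) + EPS)


def negative_cosine(x, y):
    """Batch-wise stand-in: negative mean cosine similarity between paired rows."""
    xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + EPS)
    yn = y / (np.linalg.norm(y, axis=1, keepdims=True) + EPS)
    return -np.mean(np.sum(xn * yn, axis=1))


def dual_contrastive_loss(users, items, users_view, items_view, alpha=1.0, beta=1.0):
    """L = L_UIBT + alpha * L_UUII + beta * L_BCL, under the assumptions above."""
    l_uibt = cross_correlation_loss(users, items)
    l_uuii = 0.5 * (polynomial_uniformity(users) + polynomial_uniformity(items))
    l_bcl = 0.5 * (negative_cosine(users, users_view) + negative_cosine(items, items_view))
    return l_uibt + alpha * l_uuii + beta * l_bcl


# Toy usage: 8 positive user-item pairs with 4-dimensional embeddings.
rng = np.random.default_rng(0)
users = rng.normal(size=(8, 4))
items = rng.normal(size=(8, 4))
users_view = users + 0.1 * rng.normal(size=users.shape)  # hypothetical second view
items_view = items + 0.1 * rng.normal(size=items.shape)
print(dual_contrastive_loss(users, items, users_view, items_view))
```

In this sketch the two feature-wise terms operate on $d \times d$ matrices built across the batch, while the batch-wise term compares embeddings row by row, which is the distinction between FCL and BCL that the abstract draws. The weights `alpha` and `beta` correspond to $\alpha$ and $\beta$ in the combined loss.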
