Ads Recommendation in a Collapsed and Entangled World
Junwei Pan, Wei Xue, Ximei Wang, Haibin Yu, Xun Liu, Shijie Quan, Xueming Qiu, Dapeng Liu, Lei Xiao, Jie Jiang
TL;DR
This paper analyzes Tencent's ads recommender through the lens of representation learning, focusing on three problems: preserving priors when encoding diverse feature types, mitigating the dimensional collapse of embeddings, and disentangling user interests across tasks and scenarios. For feature encoding, it presents TIM for sequence features, MNSE for numeric features, and Similarity Encoding for pre-trained embeddings. To scale capacity, it adds a multi-embedding paradigm (ME), GwPFM, and collapse-resilient interactions. For disentanglement, the authors introduce STEM and AME (and STEM-AL for auxiliary learning), and report consistent online gains across CTR, CVR, and LTV tasks, especially for smaller or low-resource tasks. The work also covers training enhancements (ranking losses, online learning, REW, and exploration with uncertainty) and practical analysis tools for measuring feature correlation, dimensional collapse, and interest entanglement, with results reported on Tencent's online advertising platform.
Abstract
We present Tencent's ads recommendation system and examine the challenges and practices of learning appropriate recommendation representations. Our study begins by showcasing our approaches to preserving prior knowledge when encoding features of diverse types into embedding representations. We specifically address sequence features, numeric features, and pre-trained embedding features. Subsequently, we delve into two crucial challenges related to feature representation: the dimensional collapse of embeddings and the interest entanglement across different tasks or scenarios. We propose several practical approaches to address these challenges that result in robust and disentangled recommendation representations. We then explore several training techniques to facilitate model optimization, reduce bias, and enhance exploration. Additionally, we introduce three analysis tools that enable us to study feature correlation, dimensional collapse, and interest entanglement. This work builds upon the continuous efforts of Tencent's ads recommendation team over the past decade. It summarizes general design principles and presents a series of readily applicable solutions and analysis tools. The reported performance is based on our online advertising platform, which handles hundreds of billions of requests daily and serves millions of ads to billions of users.
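To make the notion of dimensional collapse concrete, the sketch below shows one common way such a diagnostic can be computed: inspecting the singular value spectrum of an embedding table and counting how many directions carry non-negligible variance. This is an illustrative assumption rather than the authors' tool. The function names, the relative threshold of 1e-3, and the synthetic data are hypothetical.

```python
import numpy as np


def singular_spectrum(embeddings: np.ndarray) -> np.ndarray:
    """Singular values of the mean-centered embedding matrix, scaled so the largest is 1."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)  # sorted in descending order
    return s / s[0]


def effective_rank(embeddings: np.ndarray, threshold: float = 1e-3) -> int:
    """Number of singular values above `threshold` relative to the largest (hypothetical cutoff)."""
    return int((singular_spectrum(embeddings) > threshold).sum())


rng = np.random.default_rng(0)

# Synthetic "healthy" table: 1000 ids, 16 dimensions, all dimensions in use.
healthy = rng.normal(size=(1000, 16))

# Synthetic "collapsed" table: 16-dimensional vectors confined to a 4-dimensional subspace.
collapsed = rng.normal(size=(1000, 4)) @ rng.normal(size=(4, 16))

print(effective_rank(healthy))
print(effective_rank(collapsed))
```

A collapsed table has many near-zero singular values, so its effective rank falls well below the nominal embedding dimension. In practice the threshold and the choice of which embedding table to inspect would need to be tuned to the model at hand.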
