
STEM: Unleashing the Power of Embeddings for Multi-task Recommendation

Liangcai Su, Junwei Pan, Ximei Wang, Xi Xiao, Shijie Quan, Xihua Chen, Jie Jiang

TL;DR

This paper tackles negative transfer in multi-task recommender systems by introducing the STEM paradigm, which augments shared embeddings with task-specific embeddings to model diverse user preferences across tasks. The authors design STEM-Net with an All Forward Task-specific Backward gating network and independent per-task and shared embedding tables, enabling direct cross-task transfer while preserving task-specific representations. Empirical results on three public MTL datasets and extensive ablations show STEM-Net consistently outperforms state-of-the-art models, including MMoE and PLE, with notable gains on comparable-sample subsets. Online deployment in Tencent’s display platform demonstrates real-world gains in AUC and GMV, highlighting practical significance for production recommender systems.

Abstract

Multi-task learning (MTL) has gained significant popularity in recommender systems because it enables simultaneous optimization of multiple objectives. A key challenge in MTL is negative transfer, but existing studies have examined negative transfer over all samples at once, overlooking the inherent complexities within them. We split the samples according to the relative amount of positive feedback among tasks. Surprisingly, negative transfer still occurs in existing MTL methods on samples that receive comparable feedback across tasks. Existing work commonly employs a shared-embedding paradigm, which limits the ability to model diverse user preferences on different tasks. In this paper, we introduce a novel Shared and Task-specific EMbeddings (STEM) paradigm that incorporates both shared and task-specific embeddings to effectively capture task-specific user preferences. Under this paradigm, we propose a simple model, STEM-Net, which is equipped with an All Forward Task-specific Backward gating network to facilitate the learning of task-specific embeddings and direct knowledge transfer across tasks. Remarkably, STEM-Net demonstrates exceptional performance on comparable samples, achieving positive transfer. Comprehensive evaluation on three public MTL recommendation datasets demonstrates that STEM-Net outperforms state-of-the-art models by a substantial margin. Our code is released at https://github.com/LiangcaiSu/STEM.
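
To make the idea of splitting samples by relative positive feedback concrete, here is a minimal sketch for two tasks. The abstract does not state the grouping key or the thresholds, so everything below is an illustrative assumption rather than the paper's definition: the function name `split_by_feedback`, the `ratio` threshold of 2.0, the label strings, and the idea of counting positives per group (for example, per user).

```python
def split_by_feedback(pos_a, pos_b, ratio=2.0):
    """Label one group of samples by the relative positive feedback of two tasks.

    pos_a, pos_b: positive-feedback counts for tasks A and B in the group.
    ratio: hypothetical threshold; one task "overwhelms" the other when its
           count is at least `ratio` times the other task's count.
    """
    if pos_a == 0 and pos_b == 0:
        return "no-feedback"       # nothing to compare
    if pos_a >= ratio * pos_b:
        return "A-overwhelming"    # also covers pos_b == 0 with pos_a > 0
    if pos_b >= ratio * pos_a:
        return "B-overwhelming"
    return "comparable"            # counts are within the ratio of each other


# Hypothetical per-group counts: (positives for task A, positives for task B).
groups = {"u1": (10, 9), "u2": (10, 1), "u3": (0, 5)}
labels = {name: split_by_feedback(a, b) for name, (a, b) in groups.items()}
```

Under these assumptions, a model can then be evaluated separately on the comparable groups and on each overwhelming subset, which is the kind of per-subset comparison the abstract describes.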

Paper Structure (25 sections, 7 equations, 5 figures, 7 tables)

Figures (5)

  • Figure 1: Existing MTL models such as MMoE and PLE suffer from negative transfer on the comparable subset of TikTok test samples, while STEM-Net achieves positive transfer. STEM-Net also outperforms MMoE and PLE on task-overwhelming subsets. ST: Single-Task.
  • Figure 2: Comparison between representative MTL models and our proposed STEM-Net. Dotted lines denote connections with a stop-gradient operation.
  • Figure 3: Overview of STEM-Net.
  • Figure 4: Comparison of gating networks of MMoE (All Forward All Backward), PLE (Task-specific Forward Task-specific Backward) and our STEM-Net (All Forward Task-specific Backward).
  • Figure 5: Distance distributions of the contradictory user-item pair set $S$ (solid fill) and of the whole user-item pair set (slash hatching) for: (a) the single-task Like embedding, (b) the single-task Finish embedding, (c) the PLE embedding, and, in STEM-Net, (d) the Like-specific embedding, (e) the Finish-specific embedding, and (f) the shared embedding.
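
The "All Forward Task-specific Backward" behaviour named in Figures 2 and 4 can be pictured with a small, framework-free sketch. It assumes one expert per task plus one shared expert, a softmax gate, and gate weights treated as constants when reasoning about gradients. The function names, the `owners` labels, and the explicit `grad_weights` list are illustrative, not the released implementation. In an autodiff framework, the zeros below would correspond to applying a stop-gradient to the other tasks' expert outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aftb_combine(expert_outputs, gate_scores, owners, task):
    """Gate-weighted combination of expert outputs for one task.

    expert_outputs: list of equal-length vectors, one per expert.
    gate_scores:    raw gate logits of `task`, one per expert.
    owners:         owner of each expert, a task name or "shared".
    Returns (combined, grad_weights). grad_weights[k] is the factor by which
    the gradient of `combined` reaches expert k, with gate weights held fixed.
    """
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    # All Forward: every expert contributes to the forward pass.
    combined = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
    # Task-specific Backward: only the task's own expert and the shared
    # expert receive gradient; other tasks' experts are blocked (0.0).
    grad_weights = [
        w if owner in (task, "shared") else 0.0
        for w, owner in zip(weights, owners)
    ]
    return combined, grad_weights


owners = ["like", "finish", "shared"]
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
combined, grads = aftb_combine(outputs, [0.0, 0.0, 0.0], owners, task="like")
```

In this toy call the "finish" expert still shapes the forward value for the "like" task, but its gradient factor is 0.0, so the "like" loss would not update it. By contrast, an All Forward All Backward gate would keep every factor nonzero, and a Task-specific Forward gate would drop the "finish" expert from the forward sum as well.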