Crocodile: Cross Experts Covariance for Disentangled Learning in Multi-Domain Recommendation

Zhutian Lin, Junwei Pan, Haibin Yu, Xi Xiao, Ximei Wang, Zhixiang Feng, Shifeng Wen, Shudong Huang, Dapeng Liu, Lei Xiao

TL;DR

Crocodile is a novel Cross-experts Covariance Loss for Disentangled Learning model. It employs multiple embedding tables to make the model domain-aware at the embedding layer, which accounts for most of the model's parameters, and applies a covariance loss to these embeddings to disentangle them, enabling the model to capture diverse user interests across domains.

Abstract

Multi-domain learning (MDL) has become a prominent topic in enhancing the quality of personalized services. It is critical to learn the commonalities between domains while preserving the distinct characteristics of each domain. However, this leads to a challenging dilemma in MDL. On the one hand, a model needs to leverage domain-aware modules such as experts or embeddings to preserve each domain's distinctiveness. On the other hand, real-world datasets often exhibit long-tailed distributions across domains, where some domains may lack sufficient samples to effectively train their specific modules. Unfortunately, nearly all existing work falls short of resolving this dilemma. To this end, we propose a novel Cross-experts Covariance Loss for Disentangled Learning model (Crocodile), which employs multiple embedding tables to make the model domain-aware at the embedding layer, which accounts for most of the model's parameters, and applies a covariance loss to these embeddings to disentangle them, enabling the model to capture diverse user interests across domains. Empirical analysis demonstrates that our method successfully addresses both challenges and outperforms all state-of-the-art methods on public datasets. During online A/B testing on Tencent's advertising platform, Crocodile achieves a 0.72% CTR lift and a 0.73% GMV lift on a primary advertising scenario.
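
The abstract describes a covariance loss that disentangles the representations learned from different embedding tables. The sketch below illustrates one way such a penalty could be computed. It is an assumption for illustration, not the paper's exact formulation (the paper's equations are not reproduced here): the function name `cross_expert_cov_loss`, the use of an L1 (mean absolute value) penalty on the cross-covariance matrix, and the averaging over expert pairs are all hypothetical choices.

```python
import numpy as np


def cross_expert_cov_loss(expert_outputs):
    """Hypothetical cross-expert covariance penalty (illustrative only).

    expert_outputs: list of arrays, each of shape (batch, dim), holding the
    representations produced by one expert / embedding table for a batch.
    Returns the mean absolute cross-covariance, averaged over expert pairs.
    """
    # Center each expert's output over the batch dimension.
    centered = [x - x.mean(axis=0, keepdims=True) for x in expert_outputs]
    n = centered[0].shape[0]

    total, pairs = 0.0, 0
    for i in range(len(centered)):
        for j in range(i + 1, len(centered)):
            # Cross-covariance between the dimensions of expert i and expert j.
            cov = centered[i].T @ centered[j] / (n - 1)
            # Assumed L1-style penalty: push every cross-covariance entry to 0.
            total += np.abs(cov).mean()
            pairs += 1
    return total / pairs


# Toy check: two uncorrelated one-dimensional "experts" versus a duplicated one.
a = np.array([[1.0], [-1.0], [1.0], [-1.0]])
b = np.array([[1.0], [1.0], [-1.0], [-1.0]])
loss_uncorrelated = cross_expert_cov_loss([a, b])
loss_duplicated = cross_expert_cov_loss([a, a])
```

In this toy setup the penalty vanishes when the two experts are uncorrelated and is positive when they carry identical information, which is the behavior a disentangling term is meant to encourage. In training, such a term would typically be added to the main prediction loss with a weighting coefficient.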

Paper Structure

This paper contains 31 sections, 11 equations, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: Our proposed Crocodile successfully resolves the dilemma of preserving domain distinctiveness (measured by Diversity Index) vs. sufficient parameter learning (measured by Information Abundance), and achieves the best performance as measured by gAUC.
  • Figure 2: Information Abundance ($\log(\mathrm{IA})$) dynamics of SDEM's bottom layer of experts and item embeddings on Kuairand1k. High and Low denote the items with the highest and lowest frequencies.
  • Figure 3: Crocodile Architecture, which consists of a Multi-Embedding (ME) layer, a Cross-expert Covariance Loss (CovLoss), and a Prior Informed Element-wise Gating (PEG) mechanism.
  • Figure 4: Architecture comparison of different MDL models with multiple embeddings.
  • Figure 5: Comparison of the singular value spectrum $\log(\sigma)$ of item ID and user ID embeddings on the Kuairand1k dataset. We report $\log(\sigma)$ of the S6-specific embedding for ME-PLE and SDEM, the original $\log(\sigma)$ for the single-embedding model, and the average over embeddings for the other ME methods.
  • ...and 3 more figures
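
Figures 2 and 5 refer to Information Abundance and to the singular value spectrum $\log(\sigma)$ of embedding matrices. The sketch below shows how these quantities could be computed, assuming the definition commonly used in the embedding-collapse literature: the sum of an embedding matrix's singular values divided by its largest singular value. The paper's own definition is not shown in this excerpt, so treat the definition and the helper names `log_singular_spectrum` and `information_abundance` as assumptions.

```python
import numpy as np


def log_singular_spectrum(embedding, eps=1e-12):
    """Return log singular values of a (num_ids, dim) embedding matrix."""
    sigma = np.linalg.svd(embedding, compute_uv=False)
    # eps guards against log(0) for rank-deficient matrices.
    return np.log(sigma + eps)


def information_abundance(embedding):
    """Assumed definition: sum of singular values over the largest one.

    Values near 1 indicate a collapsed (nearly rank-one) embedding;
    values near the embedding dimension indicate a balanced spectrum.
    """
    sigma = np.linalg.svd(embedding, compute_uv=False)
    return sigma.sum() / sigma.max()


# Toy check: a balanced matrix versus a rank-one (collapsed) matrix.
balanced = np.eye(4)
collapsed = np.outer([1.0, 2.0, 3.0], [1.0, 1.0])
ia_balanced = information_abundance(balanced)
ia_collapsed = information_abundance(collapsed)
```

Under this definition, a higher value suggests that the embedding spreads information across more directions, which matches how Figure 1 uses Information Abundance as a proxy for sufficient parameter learning.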