GradCraft: Elevating Multi-task Recommendations through Holistic Gradient Crafting

Yimeng Bai, Yang Zhang, Fuli Feng, Jing Lu, Xiaoxue Zang, Chenyi Lei, Yang Song

TL;DR

GradCraft frames the goal of multi-task learning in recommendation as achieving both an appropriate magnitude balance and a global direction balance across task gradients, and is designed to achieve the two concurrently.

Abstract

Recommender systems require the simultaneous optimization of multiple objectives to accurately model user interests, necessitating the application of multi-task learning methods. However, existing multi-task learning methods in recommendations overlook the specific characteristics of recommendation scenarios, falling short in achieving proper gradient balance. To address this challenge, we set the target of multi-task learning as attaining the appropriate magnitude balance and the global direction balance, and propose an innovative methodology named GradCraft in response. GradCraft dynamically adjusts gradient magnitudes to align with the maximum gradient norm, mitigating interference from gradient magnitudes for subsequent manipulation. It then employs projections to eliminate gradient conflicts in directions while considering all conflicting tasks simultaneously, theoretically guaranteeing the global resolution of direction conflicts. GradCraft ensures the concurrent achievement of appropriate magnitude balance and global direction balance, aligning with the inherent characteristics of recommendation scenarios. Both offline and online experiments attest to the efficacy of GradCraft in enhancing multi-task performance in recommendations. The source code for GradCraft can be accessed at https://github.com/baiyimeng/GradCraft.
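
The two steps described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). It assumes per-task gradients are given as flat lists of floats, uses a hypothetical blending knob `tau` that need not match the paper's $\tau$, does not model the paper's $\epsilon$, and resolves conflicts by projecting each gradient onto the orthogonal complement of the span of all gradients it conflicts with (negative inner product). All function names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def adjust_magnitudes(grads, tau=1.0):
    """Move each gradient's norm toward the largest norm.

    tau=1.0 aligns every norm with the maximum; tau=0.0 leaves norms unchanged.
    (Assumed blending rule, for illustration only.)
    """
    max_norm = max(norm(g) for g in grads)
    scaled = []
    for g in grads:
        n = norm(g)
        if n == 0.0:
            scaled.append(list(g))  # a zero gradient cannot be rescaled
            continue
        target = (1.0 - tau) * n + tau * max_norm
        scaled.append([x * target / n for x in g])
    return scaled

def project_out_conflicts(g, others, tol=1e-12):
    """Remove from g its component in the span of ALL gradients conflicting with it."""
    conflicting = [h for h in others if dot(g, h) < 0.0]
    # Gram-Schmidt: orthonormal basis of the span of the conflicting gradients.
    basis = []
    for h in conflicting:
        r = list(h)
        for b in basis:
            c = dot(r, b)
            r = [x - c * y for x, y in zip(r, b)]
        n = norm(r)
        if n > tol:
            basis.append([x / n for x in r])
    # Subtract the component along each basis vector.
    result = list(g)
    for b in basis:
        c = dot(result, b)
        result = [x - c * y for x, y in zip(result, b)]
    return result

def craft_update(grads, tau=1.0):
    """Magnitude adjustment, simultaneous projection, then summation."""
    scaled = adjust_magnitudes(grads, tau)
    projected = [
        project_out_conflicts(g, scaled[:i] + scaled[i + 1:])
        for i, g in enumerate(scaled)
    ]
    dim = len(grads[0])
    return [sum(p[k] for p in projected) for k in range(dim)]

# Two conflicting task gradients (inner product is negative).
task_grads = [[1.0, 0.0], [-1.0, 1.0]]
update = craft_update(task_grads, tau=1.0)
```

Because each gradient is projected against all of its conflicting gradients at once, the projected gradient is orthogonal to every one of them, which is the property a sequence of pairwise projections does not guarantee in general.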

Paper Structure (27 sections, 15 equations, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: An overview of GradCraft. It first adjusts gradient magnitudes based on the maximum norm. It then projects each gradient with respect to its conflicting task gradients and aggregates the results to update the recommender model, resolving direction conflicts globally.
  • Figure 2: Performance of GradCraft across different values of $\tau$ on Wechat.
  • Figure 3: Performance of GradCraft across different values of $\epsilon$ on Wechat.
  • Figure 4: Performance of GradCraft compared with the best baseline across different numbers of tasks $T$ on Wechat.