More diverse more adaptive: Comprehensive Multi-task Learning for Improved LLM Domain Adaptation in E-commerce

Tong Piao; Pei Tang; Zhipeng Zhang; Jiaqi Li; Qiao Liu; Zufeng Wu

TL;DR

This paper investigates how diverse, multi-task data can improve LLM domain adaptation in e-commerce by introducing a ShopBench-based framework and evaluating diversity at both the capability level and the task level. It designs a broad set of tasks aligned with e-commerce logic, draws training data from open sources and from LLM generation, and uses LoRA rank to balance tuning capacity against data diversity. Empirical results show that increasing data diversity and tuning capacity together yields cumulative gains, and the best model places 5th in KDD Cup 2024 Task 1, supporting the approach. The study offers practical guidance for building diverse, capable e-commerce LLMs and points to future directions, including alternative capacity-expansion methods such as MMoE and multi-LoRA.

Abstract

In recent years, Large Language Models (LLMs) have been widely applied across various domains due to their powerful domain adaptation capabilities. Previous studies have suggested that diverse, multi-modal data can enhance LLMs' domain adaptation performance. However, this hypothesis remains insufficiently validated in the e-commerce sector. To address this gap, we propose a comprehensive e-commerce multi-task framework and design empirical experiments to examine the impact of diverse data and tasks on LLMs from two perspectives: "capability comprehensiveness" and "task comprehensiveness." Specifically, we observe significant improvements in LLM performance both when progressively introducing tasks from new major capability areas and when continuously adding subtasks within each major capability area. Furthermore, we observe that increasing model capacity amplifies the benefits of diversity, suggesting a synergistic relationship between model capacity and data diversity. Finally, we validate the best-performing model from our empirical experiments in KDD Cup 2024, where it ranked 5th in Task 1. This outcome demonstrates the significance of our research for advancing LLMs in the e-commerce domain.
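
To make the notion of "tuning capacity" concrete, the following minimal sketch counts the trainable parameters that a single LoRA adapter adds as its rank grows. It relies only on the standard LoRA formulation, in which the weight update is a product of two low-rank matrices. The function name, the 4096-wide projection, and the ranks shown are illustrative assumptions, not values reported in the paper.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter on a (d_out x d_in) weight matrix.

    LoRA learns an update B @ A, with B of shape (d_out, rank) and
    A of shape (rank, d_in), so the count is rank * (d_in + d_out).
    """
    if rank <= 0:
        raise ValueError("rank must be positive")
    return rank * (d_in + d_out)


# Hypothetical square projection; sizes and ranks are illustrative only.
HIDDEN = 4096
FULL = HIDDEN * HIDDEN

for r in (8, 16, 64):
    added = lora_trainable_params(HIDDEN, HIDDEN, r)
    print(f"rank={r:>3}: {added:,} trainable params "
          f"({added / FULL:.2%} of the full matrix)")
```

Because the count is linear in the rank, doubling the rank doubles the adapter's trainable parameters, which is one simple way to read rank as a capacity knob.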
