Dual Collaborative LLMs via Continual Fine-Tuning for Serendipitous Recommendation

Hongxiang Lin, Hao Guo, Zeshun Li, Erpeng Xue, Yongqian He, Xiangyu Hou, Zhaoyu Hu, Lei Wang, Sheng Chen

TL;DR

CoEA tackles serendipitous recommendation with two components: Dual-Stable Interest Exploration (DSIE), which jointly models long-term group identity and short-term individual interests, and a Periodic Collaborative Optimization (PCO) loop that continually refines the Novelty and Relevance LLMs. The system uses RQ-VAE to form Collaborative Semantic IDs, a Profile LLM to describe group personas, and a causal self-attention framework to model behavior sequences, all backed by offline storage to support scalable online serving. Offline experiments on Movielens-1M and MTRec show improved quality and novelty, online tests show significant gains in GTV and 7D-NIEP, and ablations confirm that both DSIE and PCO are critical to sustaining dual-model optimization. The work demonstrates practical impact on industrial-scale platforms by enabling dynamic, data-driven serendipitous recommendations through a tightly integrated LLM-based framework.
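As a rough illustration of how a residual quantizer such as the one inside RQ-VAE can turn an item embedding into a multi-level ID, the sketch below shows only the quantization step. The codebooks, the 2-D embedding, and the function names are hypothetical; the paper's trained encoder, codebook learning, and collaborative signals are not reproduced here.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: List[Vector], x: Vector) -> int:
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)),
    )


def residual_quantize(x: Vector, codebooks: List[List[Vector]]) -> List[int]:
    """Return one code index per level.

    Each level quantizes the residual left over by the previous level,
    so the sequence of indices refines the approximation step by step.
    """
    residual = list(x)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen codeword; the next level sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


# Hypothetical two-level codebooks for 2-D embeddings (illustration only).
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.25, -0.25)],
]
semantic_id = residual_quantize((1.2, 0.8), codebooks)
```

Items whose embeddings are close share the leading indices of their IDs, which is what makes such IDs usable as coarse-to-fine tokens.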

Abstract

Traditional recommendation systems tend to trap users in strong feedback loops by excessively pushing content aligned with their historical preferences, thereby limiting exploration opportunities and causing content fatigue. Although large language models (LLMs) show potential through their diverse content generation capabilities, existing LLM-enhanced dual-model frameworks face two major limitations: first, they overlook long-term preferences driven by group identity, leading to biased interest modeling; second, they rely on static optimization, in which a one-time alignment process fails to leverage incremental user data for closed-loop optimization. To address these challenges, we propose the Co-Evolutionary Alignment (CoEA) method. To correct the interest modeling bias, we introduce the Dual-Stable Interest Exploration (DSIE) module, which jointly models long-term group identity and short-term individual interests through parallel processing of behavioral sequences. To overcome static optimization, we design a Periodic Collaborative Optimization (PCO) mechanism. This mechanism periodically uses the Relevance LLM to verify preferences on incremental data, guides the Novelty LLM to fine-tune on the verification results, and then feeds the output of the continually fine-tuned Novelty LLM back to the Relevance LLM for re-evaluation, thereby achieving dynamic closed-loop optimization. Extensive online and offline experiments verify the effectiveness of the CoEA model in serendipitous recommendation.
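The data flow of one PCO round, as described above, can be sketched with plain callables standing in for the two LLMs. This is a minimal sketch under stated assumptions: `pco_round`, the `(user_id, item_id)` record shape, and the toy stand-ins are hypothetical, and the actual fine-tuning objective used by the paper is not modeled.

```python
from typing import Callable, List, Tuple

Interaction = Tuple[str, str]  # (user_id, item_id); hypothetical record shape


def pco_round(
    incremental: List[Interaction],
    relevance_verify: Callable[[Interaction], bool],
    finetune_novelty: Callable[[List[Interaction]], Callable[[str], str]],
    users: List[str],
) -> Tuple[List[Interaction], List[Interaction]]:
    """Run one verify -> fine-tune -> re-evaluate cycle."""
    # Step 1: the Relevance model verifies preferences on incremental data.
    verified = [x for x in incremental if relevance_verify(x)]
    # Step 2: the Novelty model is fine-tuned on the verified results.
    novelty_generate = finetune_novelty(verified)
    # Step 3: the Novelty model's outputs go back to the Relevance model.
    candidates = [(u, novelty_generate(u)) for u in users]
    accepted = [c for c in candidates if relevance_verify(c)]
    return verified, accepted


# Toy stand-ins for the two LLMs (assumptions for illustration only).
liked = {("u1", "jazz"), ("u2", "noir")}


def relevance_verify(x: Interaction) -> bool:
    return x in liked


def finetune_novelty(verified: List[Interaction]) -> Callable[[str], str]:
    pool = {user: item for user, item in verified}
    return lambda user: pool.get(user, "unknown")


verified, accepted = pco_round(
    [("u1", "jazz"), ("u1", "polka"), ("u2", "noir")],
    relevance_verify,
    finetune_novelty,
    ["u1", "u2", "u3"],
)
```

Calling `pco_round` once per period on each new batch of interactions gives the closed loop; in a real system the accepted candidates would feed the next round rather than being returned and discarded.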

Paper Structure

This paper contains 44 sections, 23 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Comparison between current and our methods.
  • Figure 2: The framework of our CoEA method.
  • Figure 3: Heatmaps of inter-group and intra-group user similarity for the Movielens-1M and MTRec datasets.
  • Figure 4: Quality and Novelty Metrics Across Fine-tuning Rounds: CoEA vs. CoEA (w/o KL) in Movielens-1M dataset.
  • Figure 5: Deployment architecture of CoEA in online recommendation system.
  • ...and 2 more figures