Bursting Filter Bubble: Enhancing Serendipity Recommendations with Aligned Large Language Models
Yunjia Xi, Muyan Weng, Wen Chen, Chao Yi, Dian Chen, Gaoyang Guo, Mao Zhang, Jian Wu, Yuning Jiang, Qingwen Liu, Yong Yu, Weinan Zhang
TL;DR
The paper tackles the filter bubble problem in recommender systems by introducing SERAL, a three-stage framework that uses aligned large language models to deliver serendipity without sacrificing utility. It combines Cognition Profile Generation, which compresses long user histories; SerenGPT Alignment with Collaborative Data Intervention (CDI), which aligns serendipity judgments with human preferences; and Nearline Adaptation, which enables efficient industrial deployment through a serendipity channel and caching. Empirical results on Taobao show consistent gains in serendipity exposure, clicks, and transactions with minimal impact on overall revenue, and ablations confirm the importance of IPO-based preference alignment and of CDI for data quality and diversity. The work demonstrates the practical viability of aligned LLMs for serendipity in large-scale RSs and provides a deployable blueprint for breaking the filter bubble in industrial systems.
Abstract
Recommender systems (RSs) often suffer from the feedback loop phenomenon, in which RSs are trained on data biased by their own recommendations. This leads to the filter bubble effect, which reinforces homogeneous content and reduces user satisfaction. Serendipity recommendations, which offer unexpected yet relevant items, have been proposed to counter this effect. Recently, large language models (LLMs) have shown potential in serendipity prediction due to their extensive world knowledge and reasoning capabilities. However, they still face challenges in aligning serendipity judgments with human assessments, handling long user behavior sequences, and meeting the latency requirements of industrial RSs. To address these issues, we propose SERAL (Serendipity Recommendations with Aligned Large Language Models), a framework comprising three stages: (1) Cognition Profile Generation, which compresses user behavior into multi-level profiles; (2) SerenGPT Alignment, which aligns serendipity judgments with human preferences using enriched training data; and (3) Nearline Adaptation, which integrates SerenGPT into industrial RS pipelines efficiently. Online experiments demonstrate that SERAL improves the exposure ratio (PVR), clicks, and transactions of serendipitous items by 5.7%, 29.56%, and 27.6%, respectively, enhancing user experience without much impact on overall revenue. SERAL has now been fully deployed in "Guess What You Like" on the Taobao App homepage.
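As a rough illustration of the Nearline Adaptation idea described above, the sketch below shows one way a serendipity channel could be served from a cache so that no LLM call sits on the request path. It is a minimal sketch under stated assumptions, not the paper's implementation: `SerendipityCache`, `nearline_refresh`, `serve`, the TTL policy, and the append-style merge are all hypothetical, and the `summarize` and `score` callables are stand-ins for Cognition Profile Generation and SerenGPT respectively.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SerendipityCache:
    """Hypothetical per-user cache: written nearline, read at request time."""
    ttl_seconds: float
    _store: Dict[str, Tuple[float, List[str]]] = field(default_factory=dict)

    def put(self, user_id: str, items: List[str], now: float) -> None:
        self._store[user_id] = (now, list(items))

    def get(self, user_id: str, now: float) -> List[str]:
        entry = self._store.get(user_id)
        if entry is None:
            return []
        written_at, items = entry
        # Treat stale entries as missing so outdated picks are not served.
        if now - written_at > self.ttl_seconds:
            return []
        return list(items)


def nearline_refresh(
    user_id: str,
    history: List[str],
    candidates: List[str],
    summarize: Callable[[List[str]], object],   # stand-in for stage 1
    score: Callable[[object, str], float],      # stand-in for stage 2
    cache: SerendipityCache,
    top_k: int,
    now: float,
) -> None:
    """Run the expensive steps off the request path and cache the result."""
    profile = summarize(history)
    ranked = sorted(candidates, key=lambda item: score(profile, item), reverse=True)
    cache.put(user_id, ranked[:top_k], now)


def serve(
    user_id: str,
    main_channel: List[str],
    cache: SerendipityCache,
    slots: int,
    now: float,
) -> List[str]:
    """Add up to `slots` cached serendipity items to the main results."""
    cached = cache.get(user_id, now)
    extra = [item for item in cached if item not in main_channel][:slots]
    return main_channel + extra
```

A toy usage, with a profile that is just the set of categories already seen and a score that favors unseen categories (both assumptions made purely for illustration):

```python
cache = SerendipityCache(ttl_seconds=60.0)
nearline_refresh(
    "u1",
    history=["shoes", "shoes"],
    candidates=["shoes:a", "tea:b", "camping:c"],
    summarize=lambda h: set(h),
    score=lambda profile, item: 0.0 if item.split(":")[0] in profile else 1.0,
    cache=cache,
    top_k=2,
    now=0.0,
)
print(serve("u1", ["shoes:x", "shoes:y"], cache, slots=1, now=10.0))
```

Because the request path only reads the cache, its latency does not depend on the LLM; the trade-off in this sketch is that cached picks can lag behind the user's most recent behavior until the next refresh.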
