An Industrial-Scale Sequential Recommender for LinkedIn Feed Ranking

Lars Hertel, Gaurav Srivastava, Syed Ali Naqvi, Satyam Kumar, Yue Zhang, Borja Ocejo, Benjamin Zelditch, Adrian Englhardt, Hailing Cheng, Andy Hu, Antonio Alonso, Daming Li, Siddharth Dangi, Chen Zhu, Mingzhou Zhou, Wanning Li, Tao Huang, Fedor Borisyuk, Ganesh Parameswaran, Birjodh Singh Tiwana, Sriram Sankar, Qing Lan, Julie Choi, Souvik Ghosh

TL;DR

Feed-SR is a scalable transformer-based sequential recommender for LinkedIn Feed that handles long histories and billions of posts under stringent production constraints. The model interleaves posts and actions, uses RoPE-based positional encoding, applies late fusion of context features, and predicts through a multi-task MMoE head; it delivers online gains while meeting latency and throughput targets. The authors introduce training techniques (IPW, incremental training, temporal/positional weighting), address in-session leakage, and describe a system architecture with disaggregated CPU-GPU inference and specialized kernels (Shared Context Batching, SRMIS) for efficiency. Online A/B tests show a +2.10% increase in time spent, and the work outlines deployment lessons and energy considerations that support production viability at scale.

Abstract

LinkedIn Feed enables professionals worldwide to discover relevant content, build connections, and share knowledge at scale. We present Feed Sequential Recommender (Feed-SR), a transformer-based sequential ranking model for LinkedIn Feed that replaces a DCNv2-based ranker and meets strict production constraints. We detail the modeling choices, training techniques, and serving optimizations that enable deployment at LinkedIn scale. Feed-SR is currently the primary member experience on LinkedIn's Feed and shows significant improvements in member engagement (+2.10% time spent) in online A/B tests compared to the existing production model. We also describe our deployment experience with alternative sequential and LLM-based ranking architectures and why Feed-SR provided the best combination of online metrics and production efficiency.

Paper Structure (59 sections, 4 equations, 12 figures, 5 tables)

Figures (12)

  • Figure 1: The Feed SR model architecture.
  • Figure 2: Scaling of Long Dwell AUC as a function of training FLOPS for Feed SR. Baseline is the current Feed production model.
  • Figure 3: System Architecture of Feed SR.
  • Figure 4: Scaling of Contributions AUC as a function of training FLOPS for Feed SR.
  • Figure 5: Scaling of normalized evaluation entropy as a function of training FLOPS for Feed SR.
  • ...and 7 more figures