LiRank: Industrial Large Scale Ranking Models at LinkedIn

Fedor Borisyuk, Mingzhou Zhou, Qingquan Song, Siyu Zhu, Birjodh Tiwana, Ganesh Parameswaran, Siddharth Dangi, Lars Hertel, Qiang Xiao, Xiaochen Hou, Yunbo Ouyang, Aman Gupta, Sheallika Singh, Dan Liu, Hailing Cheng, Lei Le, Jonathan Hung, Sathiya Keerthi, Ruoyan Wang, Fengyu Zhang, Mohit Kothari, Chen Zhu, Daqi Sun, Yun Dai, Xun Luan, Sirou Zhu, Zhiwei Wang, Neil Daftary, Qianqi Shen, Chengming Jiang, Haichao Wei, Maneesh Varshney, Amol Ghoting, Souvik Ghosh

TL;DR

LiRank is a production framework for large-scale ranking at LinkedIn that combines Residual DCN, isotonic calibration, Dense Gating, TransAct, and incremental training in a unified system. It covers modeling for Feed ranking, Ads CTR prediction, and Jobs recommendations, together with scalable training and model compression techniques (QR hashing, embedding quantization) that make deployment at scale feasible. Reported relative production gains include +0.5% member sessions in Feed, +1.76% qualified job applications for Jobs, and +4.3% CTR for Ads. The work also gives concrete guidance on calibration, explore/exploit methods in production, and the training, compression, and serving of large ranking models.

Abstract

We present LiRank, a large-scale ranking framework at LinkedIn that brings to production state-of-the-art modeling architectures and optimization methods. We unveil several modeling improvements, including Residual DCN, which adds attention and residual connections to the famous DCNv2 architecture. We share insights into combining and tuning SOTA architectures to create a unified model, including Dense Gating, Transformers, and Residual DCN. We also propose novel techniques for calibration and describe how we productionalized deep-learning-based explore/exploit methods. To enable effective, production-grade serving of large ranking models, we detail how to train and compress models using quantization and vocabulary compression. We provide details about the deployment setup for large-scale use cases of Feed ranking, Jobs Recommendations, and Ads click-through rate (CTR) prediction. We summarize our learnings from various A/B tests by elucidating the most effective technical approaches. These ideas have contributed to relative metric improvements across the board at LinkedIn: +0.5% member sessions in the Feed, +1.76% qualified job applications for Jobs search and recommendations, and +4.3% for Ads CTR. We hope this work can provide practical insights and solutions for practitioners interested in leveraging large-scale deep ranking systems.
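
The TL;DR names QR hashing as one of the vocabulary compression techniques. As a rough illustration only, and not LinkedIn's implementation, the general quotient-remainder idea can be sketched in a few lines of Python: each ID is split into a quotient and a remainder, each part indexes a small table, and the two rows are combined. The class name `QREmbedding`, the table sizes, the random initialization, and the choice to combine rows by summation are all assumptions made for this sketch.

```python
import random


class QREmbedding:
    """Hypothetical quotient-remainder embedding: two small tables replace one large one."""

    def __init__(self, vocab_size, num_remainder_buckets, dim, seed=0):
        rng = random.Random(seed)
        self.m = num_remainder_buckets
        # Ceiling division: number of distinct quotients for IDs in [0, vocab_size).
        num_quotient_rows = -(-vocab_size // self.m)
        self.quotient_table = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(num_quotient_rows)
        ]
        self.remainder_table = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(self.m)
        ]

    def total_rows(self):
        """Rows actually stored, versus vocab_size rows for a full table."""
        return len(self.quotient_table) + len(self.remainder_table)

    def lookup(self, item_id):
        # Every ID maps to a unique (quotient, remainder) pair.
        q, r = divmod(item_id, self.m)
        # Assumption: combine the two rows by element-wise sum.
        return [a + b for a, b in zip(self.quotient_table[q], self.remainder_table[r])]


emb = QREmbedding(vocab_size=1_000_000, num_remainder_buckets=1_000, dim=8)
vector = emb.lookup(123_456)
stored_rows = emb.total_rows()  # 1,000 quotient rows + 1,000 remainder rows
```

In this toy configuration, 2,000 stored rows stand in for a table of 1,000,000 rows, while distinct IDs still receive distinct index pairs. In a real model the tables would be trainable parameters rather than fixed random values.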

Paper Structure

This paper contains 39 sections, 4 equations, 10 figures, and 10 tables.

Figures (10)

  • Figure 1: Contribution tower of the main Feed ranking model
  • Figure 2: Ads CTR chargeability-based multi-task model
  • Figure 3: Residual Cross Network
  • Figure 4: Isotonic layer representation
  • Figure 5: RO Wide model on click and like towers
  • ...and 5 more figures