LANTERN: Scalable Distillation of Large Language Models for Job-Person Fit and Explanation

Zhoutong Fu, Yihan Cao, Yi-Lin Chen, Aman Lunia, Liming Dong, Neha Saraf, Ruijie Jiang, Yun Dai, Qingquan Song, Tan Wang, Guoyao Li, Derek Koh, Haichao Wei, Zhipeng Wang, Aman Gupta, Chengming Jiang, Jianqiang Shen, Liangjie Hong, Wenjing Zhang

TL;DR

LANTERN tackles the challenge of deploying domain-specific LLMs for job-person fit and explanation at scale by distilling a powerful black-box teacher into two lightweight student models: an encoder-based classifier and a decoder-based explainer. The framework combines data- and logit-level knowledge distillation, plus post-training techniques and prompt engineering, to deliver high-quality explanations and accurate fit ratings with low latency. Offline results show improvements in ROUGE metrics for explanations and higher accuracy in classification, while online deployment yields measurable gains in apply rate and qualified applications at LinkedIn. The work provides practical guidelines for synthetic data generation, multi-stage distillation, and production-serving optimizations, offering a scalable blueprint for domain-specific, interpretable LLM-based systems.

Abstract

Large language models (LLMs) have achieved strong performance across a wide range of natural language processing tasks. However, deploying LLMs at scale for domain-specific applications, such as job-person fit and explanation on job-seeking platforms, introduces distinct challenges. At LinkedIn, the job-person fit task requires analyzing a candidate's public profile against job requirements to produce both a fit assessment and a detailed explanation. Directly applying open-source or finetuned LLMs to this task often fails to yield high-quality, actionable feedback because of the complexity of the domain and the need for structured outputs. Moreover, the large size of these models leads to high inference latency and limits scalability, making them unsuitable for online use. To address these challenges, we introduce LANTERN, a novel LLM knowledge distillation framework tailored specifically to job-person fit tasks. LANTERN models multiple objectives with two models: an encoder model for classification and a decoder model for explanation. To better distill knowledge from a strong black-box teacher model into multiple downstream models, LANTERN incorporates multi-level knowledge distillation that integrates both data-level and logit-level insights. In addition to introducing the knowledge distillation framework, we share our insights on post-training techniques and prompt engineering, both of which are crucial for successfully adapting LLMs to domain-specific downstream tasks. Extensive experimental results demonstrate that LANTERN significantly improves task-specific metrics for both job-person fit and explanation. Online evaluations further confirm its effectiveness, showing measurable gains in job seeker engagement, including a 0.24% increase in apply rate and a 0.28% increase in qualified applications.
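
To make the idea of logit-level distillation concrete, the following is a minimal sketch of a commonly used formulation: a temperature-softened KL term between teacher and student distributions, blended with a cross-entropy term on the hard label. It is not taken from the paper; the function names, the `temperature` and `alpha` parameters, and the weighting scheme are all illustrative assumptions, and the sketch presumes the teacher's logits are available to the training code.

```python
import math
from typing import List


def log_softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    hard_label: int,
    temperature: float = 2.0,  # assumed value, not from the paper
    alpha: float = 0.5,        # assumed blend between soft and hard targets
) -> float:
    """Blend KL(teacher || student) on softened logits with hard-label cross-entropy."""
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # Logit-level term: KL divergence from the softened teacher distribution
    # to the softened student distribution.
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )

    # Data-level term: ordinary cross-entropy against the (teacher-provided) label.
    ce = -log_softmax(student_logits)[hard_label]

    # The temperature**2 factor keeps the soft-target term on a comparable
    # scale as the temperature changes, as in the common formulation.
    return alpha * (temperature ** 2) * kl + (1.0 - alpha) * ce
```

With `alpha=1.0` the loss reduces to the soft-target term alone and is zero when student and teacher logits coincide; with `alpha=0.0` it reduces to plain supervised training on teacher-generated labels.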

Paper Structure

This paper contains 43 sections, 9 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Illustration of how the AI model assists job seekers by evaluating job-person fit. When a user expresses interest in a job, the model analyzes the posting and recommends one of two actions: apply if the fit is high, or pursue a next-best alternative if the fit is low. User behavior, such as whether the user applies, provides implicit feedback that is used to refine the model’s recommendations over time.
  • Figure 2: Illustration of how LANTERN is integrated into the LinkedIn Job Board. When a user views a job description, the classification model predicts a fit label (High/Medium/Low), which is shown alongside the job. Upon clicking “Show Reason,” the explanation model generates a detailed, personalized rationale, helping the user understand why they are a good match for the job.
  • Figure 3: LANTERN structure. The framework consists of two stages: in-house model training (left) and knowledge distillation (right). In the first stage, member information and job descriptions are collected to form a base dataset $D_{\text{base}}$, which is used to prompt a proprietary teacher model $T_1$. The generated outputs are curated via human-in-the-loop filtering to form a high-quality seed dataset $D_{\text{seed}}$ for finetuning an in-house teacher model $T_2$. In the second stage, $T_2$ is used to generate synthetic data $D_t$ for distilling an explanation model and corresponding classification data $D_{\text{cls}}$ for training a classification model. Both models are supervised via data-level knowledge distillation and are deployed for online serving.
  • Figure 4: LANTERN distillation structure.
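
The two-stage data flow described in the Figure 3 caption can be sketched as plain function composition. This is a hedged outline only: every callable passed in (`proprietary_teacher`, `human_filter`, `finetune`, `extract_label`) is a hypothetical stand-in, and two details are assumptions rather than statements from the caption: that the stage-two inputs are supplied as a separate list, and that the classification labels are derived from the in-house teacher's generated text.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]      # (member information, job description)
Teacher = Callable[[Example], str]


def build_distillation_data(
    d_base: List[Example],
    stage2_inputs: List[Example],                          # assumed separate input pool
    proprietary_teacher: Teacher,                          # T_1 (hypothetical callable)
    human_filter: Callable[[Example, str], bool],          # human-in-the-loop curation
    finetune: Callable[[List[Tuple[Example, str]]], Teacher],  # returns T_2
    extract_label: Callable[[str], str],                   # assumed label derivation
) -> Tuple[List[Tuple[Example, str]], List[Tuple[Example, str]]]:
    # Stage 1: prompt T_1 on D_base, keep only curated outputs as D_seed,
    # then finetune the in-house teacher T_2 on D_seed.
    d_seed = []
    for example in d_base:
        output = proprietary_teacher(example)
        if human_filter(example, output):
            d_seed.append((example, output))
    in_house_teacher = finetune(d_seed)

    # Stage 2: T_2 generates synthetic explanation data D_t, from which the
    # corresponding classification data D_cls is built.
    d_t = [(example, in_house_teacher(example)) for example in stage2_inputs]
    d_cls = [(example, extract_label(text)) for example, text in d_t]
    return d_t, d_cls
```

In this outline, `d_t` would supervise the explanation model and `d_cls` the classification model, matching the data-level distillation the caption describes.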