TAROT: A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data towards Effective Person-Job Fit
Yihan Cao, Xu Chen, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du, Yanbin Kang, Guangming Lu, Zi Li
TL;DR
This work addresses person-job fit on recruitment platforms by exploiting semi-structured user profiles and job descriptions, where general-domain LLMs underperform because they do not model the domain-specific structure. It introduces TAROT, a hierarchical structured language model with multitask co-pretraining across sentence-, section-, individual-, and interaction-level tasks, with losses $L_{MLM}$, $L_{Exp}$, $L_{Att}$, and $L_{App}$, trained via the joint objective $L=\sum_{*}\lambda_{*}L_{*}$. The key contributions are an architecture that combines attention fusion and cross-attention to merge user and job semantics, and four targeted pretraining tasks that align representations with domain structure. Experiments on a real LinkedIn dataset show significant gains in both job and candidate recommendation tasks, demonstrating the practical value of structure-aware embeddings for online recruitment services.
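As a minimal sketch of the joint objective $L=\sum_{*}\lambda_{*}L_{*}$, the snippet below combines per-task losses into a single weighted sum. The function name `joint_loss`, the task keys, and all numeric values are hypothetical and chosen only for illustration; the summary above does not state the actual weights $\lambda_{*}$ or how each task loss is computed.

```python
from typing import Dict


def joint_loss(task_losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum L = sum_* lambda_* * L_* over all tasks."""
    # Every task loss needs a matching weight; fail loudly otherwise.
    missing = set(task_losses) - set(weights)
    if missing:
        raise KeyError(f"no weight given for tasks: {sorted(missing)}")
    return sum(weights[name] * loss for name, loss in task_losses.items())


# Hypothetical per-task loss values and weights (not taken from the paper).
losses = {"MLM": 2.0, "Exp": 1.0, "Att": 0.5, "App": 0.25}
lambdas = {"MLM": 1.0, "Exp": 0.5, "Att": 2.0, "App": 4.0}

total = joint_loss(losses, lambdas)
print(f"joint loss: {total}")
```

In an actual training loop the per-task values would be differentiable tensors rather than floats, but the weighted-sum structure would stay the same.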
Abstract
Person-job fit is an essential part of online recruitment platforms, serving downstream applications such as Job Search and Candidate Recommendation. Recently, pretrained large language models have further improved its effectiveness by leveraging the richer textual information in user profiles and job descriptions, in addition to user behavior features and job metadata. However, their general-domain design struggles to capture the unique structural information within user profiles and job descriptions, leading to a loss of latent semantic correlations. We propose TAROT, a hierarchical multitask co-pretraining framework, to better utilize structural and semantic information for informative text embeddings. TAROT targets semi-structured text in profiles and jobs, and it is co-pretrained with multi-grained pretraining tasks to constrain the acquired semantic information at each level. Experiments on a real-world LinkedIn dataset show significant performance improvements, demonstrating its effectiveness in person-job fit tasks.
