Mitigating Language Bias in Cross-Lingual Job Retrieval: A Recruitment Platform Perspective
Napat Laosaengpha, Thanit Tativannarat, Attapol Rutherford, Ekapol Chuangsuwanich
TL;DR
This work tackles cross-lingual job retrieval on recruitment platforms by mitigating language bias in bilingual sentence representations. It introduces a Thai–English multi-task dual-encoder that jointly learns representations for job titles, descriptions, and fields via three tasks, leveraging label-free postings. A novel Language Bias Kullback–Leibler Divergence (LBKL) metric is proposed to quantify bias in retrieval, and the model achieves state-of-the-art cross-lingual performance with a smaller footprint. Empirical results on JTG-Synonym and JTG-Occupation demonstrate significant bias reduction and stronger cross-lingual retrieval, suggesting practical improvements for multilingual recruitment systems and bias-aware evaluation in retrieval models. The approach offers a scalable framework for bias-aware cross-lingual information extraction in domain-specific, low-resource languages.
Abstract
Understanding the textual components of resumes and job postings is critical for improving job-matching accuracy and optimizing job search systems on online recruitment platforms. However, existing work primarily analyzes individual components of this information in isolation, requiring a separate specialized tool for each aspect. Such disjointed methods could hinder overall generalizability in recruitment-related text processing. Therefore, we propose a unified sentence encoder that uses a multi-task dual-encoder framework to jointly learn multiple components. The results show that our method outperforms other state-of-the-art models despite its smaller model size. Moreover, we propose a novel metric, Language Bias Kullback-Leibler Divergence (LBKL), to evaluate language bias in the encoder, demonstrating significant bias reduction and superior cross-lingual performance.
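The abstract names LBKL but does not give its formula. The sketch below is therefore not the paper's metric. It only illustrates the general idea of using a KL divergence to measure how far the language mix of a retrieved list drifts from a reference language mix. The function names, the `"th"`/`"en"` labels, and the choice of the corpus distribution as the reference are all assumptions made for illustration.

```python
import math
from collections import Counter


def language_distribution(langs, labels=("th", "en"), eps=1e-9):
    """Smoothed probability of each language label in a list of labels."""
    counts = Counter(langs)
    total = len(langs) + eps * len(labels)
    return [(counts.get(label, 0) + eps) / total for label in labels]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def language_skew(retrieved_langs, reference_langs, labels=("th", "en")):
    """Hypothetical skew score: KL between the language mix of the retrieved
    items and the language mix of a reference set (assumed here to be the
    whole corpus). This is an illustration, not the LBKL definition."""
    p = language_distribution(retrieved_langs, labels)
    q = language_distribution(reference_langs, labels)
    return kl_divergence(p, q)


# A retrieved list that matches the reference mix scores close to zero;
# a list dominated by one language scores higher.
balanced = language_skew(["th", "en"], ["th", "en", "th", "en"])
skewed = language_skew(["th", "th", "th", "th"], ["th", "en"])
```

Under these assumptions, a score near zero means the retrieved list mirrors the reference language mix, while a larger score means the encoder favors one language regardless of relevance.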
