Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval
Warren Jouanneau, Marc Palyart, Emma Jouffroy
TL;DR
This work tackles scalable multilingual skill matching between job proposals and freelancer profiles by introducing a structure-aware two-tower retriever built on a frozen multilingual backbone. It encodes documents at the section level, incorporates section-type awareness, and adds a document-level transformer head to produce aligned embeddings trained with contrastive losses, including InfoNCE and adjacency-based variants. The approach outperforms baselines on retrieval-quality metrics while preserving language alignment, and its production deployment in a vector-store-based pipeline yields lower latency and higher conversion for effective matches. The results demonstrate that preserving document structure and leveraging historical interactions in a multilingual setting can significantly improve scalable candidate retrieval in global marketplaces.
Abstract
Matching a job proposal to the right freelancers is difficult to do at scale, especially across multiple languages. In this paper, we propose a novel neural retriever architecture that tackles this problem in a multilingual setting. Our method encodes project descriptions and freelancer profiles using pre-trained multilingual language models. These models serve as the backbone of a custom transformer architecture designed to preserve the structure of the profiles and projects. The model is trained with a contrastive loss on historical data. Through several experiments, we show that this approach effectively captures skill-matching similarity and enables efficient matching, outperforming traditional methods.
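To make the contrastive training objective more concrete, the following is a minimal sketch of an InfoNCE loss with in-batch negatives, one of the loss families named in the TL;DR. It is an illustration under stated assumptions, not the authors' implementation: the function name `info_nce_loss`, the use of cosine similarity, and the temperature value of 0.07 are all hypothetical choices made for this example. Row i of each input is assumed to be a historically matched project/freelancer pair, and every other row in the batch is treated as a negative.

```python
import numpy as np


def info_nce_loss(project_emb, freelancer_emb, temperature=0.07):
    """InfoNCE loss with in-batch negatives (illustrative sketch).

    project_emb, freelancer_emb: arrays of shape (batch, dim), where row i
    of each array is assumed to be a matched project/freelancer pair.
    The temperature value is an assumption, not taken from the paper.
    """
    # L2-normalise so that dot products are cosine similarities.
    p = project_emb / np.linalg.norm(project_emb, axis=1, keepdims=True)
    f = freelancer_emb / np.linalg.norm(freelancer_emb, axis=1, keepdims=True)

    # Similarity of every project to every freelancer in the batch.
    logits = (p @ f.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The matched pair sits on the diagonal; average its negative log-probability.
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    projects = rng.normal(size=(8, 16))
    # Identical embeddings stand in for perfectly aligned pairs.
    print("aligned:", info_nce_loss(projects, projects))
    # Shifting the rows breaks the pairing and should raise the loss.
    print("misaligned:", info_nce_loss(projects, np.roll(projects, 1, axis=0)))
```

In a two-tower setup, the two inputs would come from the project encoder and the freelancer encoder respectively, so minimising this loss pulls matched pairs together in the shared embedding space while pushing apart the other pairs in the batch.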
