Fake Resume Attacks: Data Poisoning on Online Job Platforms
Michiharu Yamashita, Thanh Tran, Dongwon Lee
TL;DR
This work reveals a vulnerability in online job platforms: data poisoning through fake resumes can skew career-prediction matchmaking. It introduces FRANCIS, an end-to-end framework that crafts credible fake resumes using a probabilistic trajectory generator, reality regulation, an attack module, a target-focused objective function, and a surrogate model. Across Tech and Business datasets, FRANCIS markedly degrades or elevates predictions for the targeted companies and users, achieving substantial improvement rates at modest injection levels and outperforming several baselines, including GPT-4 and DQN. The findings highlight practical risks to both job seekers and employers and call for defense mechanisms to safeguard HR workflows on online platforms.
Abstract
While recent studies have exposed various vulnerabilities to data poisoning attacks in many web services, little is known about the vulnerability of online professional job platforms (e.g., LinkedIn and Indeed). In this work, for the first time, we demonstrate critical vulnerabilities in the common Human Resources (HR) task of matching job seekers and companies on online job platforms. Capitalizing on the unrestricted format and content of job seekers' resumes and the ease of creating accounts on job platforms, we demonstrate three attack scenarios: (1) a company promotion attack to increase the likelihood of target companies being recommended, (2) a company demotion attack to decrease the likelihood of target companies being recommended, and (3) a user promotion attack to increase the likelihood of certain users being matched to certain companies. To this end, we develop an end-to-end "fake resume" generation framework, titled FRANCIS, that induces systematic prediction errors via data poisoning. Our empirical evaluation on real-world datasets reveals that data poisoning attacks can markedly skew the results of matchmaking between job seekers and companies, regardless of the underlying model, with vulnerability amplified in proportion to poisoning intensity. These findings suggest that the outputs of various services on job platforms could be manipulated by malicious users.
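To make the company promotion scenario concrete, the following minimal sketch shows how injected resumes can flip a prediction. It is not FRANCIS: the predictor is a toy transition-count model chosen only for illustration, and the company labels, the helper names `fit_transitions` and `predict_next`, and the injection-rate definition (fake resumes divided by clean resumes) are all assumptions rather than details from the paper.

```python
from collections import Counter, defaultdict

def fit_transitions(resumes):
    """Count company -> next-company transitions across career trajectories."""
    counts = defaultdict(Counter)
    for trajectory in resumes:
        for current, nxt in zip(trajectory, trajectory[1:]):
            counts[current][nxt] += 1
    return counts

def predict_next(counts, current):
    """Return the most frequent next company after `current`, or None if unseen."""
    followers = counts.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical clean resumes: each is an ordered list of employers.
clean = [["A", "B"], ["A", "B"], ["A", "C"], ["D", "B"]]

# Hypothetical fake resumes promoting target company "C" after "A".
fake = [["A", "C"], ["A", "C"]]

clean_model = fit_transitions(clean)
poisoned_model = fit_transitions(clean + fake)

# Assumed definition: number of fake resumes relative to clean resumes.
injection_rate = len(fake) / len(clean)

print(predict_next(clean_model, "A"))     # prediction before poisoning
print(predict_next(poisoned_model, "A"))  # prediction after poisoning
print(injection_rate)
```

In this toy setting two fake resumes are enough to change the predicted next employer after "A" from "B" to "C". A demotion attack would instead inject trajectories that favor competitors of the target, and real career-prediction models are far harder to steer than a raw count table.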
