Two Tickets are Better than One: Fair and Accurate Hiring Under Strategic LLM Manipulations
Lee Cohen, Jack Hsieh, Connie Hong, Judy Hanwen Shen
TL;DR
This work analyzes fairness and accuracy in hiring under stochastic resume manipulations by large language models. It introduces a two-ticket scheme where the Hirer applies an additional LLM-based manipulation and evaluates both the original and manipulated resumes, providing theoretical guarantees under a No False Positives objective. The approach generalizes to an $n$-ticket scheme, showing convergence to a group-independent decision as $n \to \infty$, thereby mitigating disparities due to unequal LLM access. Empirical validation on real resumes demonstrates improvements in true positive rate and reductions in group disparities, supporting the method’s potential to alleviate fairness gaps in automated hiring while highlighting deployment considerations and limitations.
Abstract
In an era of increasingly capable foundation models, job seekers are turning to generative AI tools to enhance their application materials. However, unequal access to and knowledge about generative AI tools can harm both employers and candidates by reducing the accuracy of hiring decisions and giving some candidates an unfair advantage. To address these challenges, we introduce a new variant of the strategic classification framework tailored to manipulations performed using large language models, accommodating varying levels of manipulation and stochastic outcomes. We propose a "two-ticket" scheme, where the hiring algorithm applies an additional manipulation to each submitted resume and considers this manipulated version together with the original submitted resume. We establish theoretical guarantees for this scheme, showing improvements in both the fairness and accuracy of hiring decisions when the true positive rate is maximized subject to a no false positives constraint. We further generalize this approach to an $n$-ticket scheme and prove that hiring outcomes converge to a fixed, group-independent decision, eliminating disparities arising from differential LLM access. Finally, we empirically validate our framework and the performance of our two-ticket scheme on real resumes using an open-source resume screening tool.
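
To make the two-ticket idea concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on assumptions that the abstract does not state: that "considering both versions together" means accepting a candidate when the best-scoring ticket clears a fixed threshold, and that the $n$-ticket variant consists of the submitted resume plus $n-1$ independent hirer-side manipulations. The names `n_ticket_decision`, `toy_manipulate`, and `toy_score` are hypothetical stand-ins for a real LLM rewriting step and a real resume screener.

```python
import random
from typing import Callable, TypeVar

Resume = TypeVar("Resume")


def n_ticket_decision(
    resume: Resume,
    score: Callable[[Resume], float],
    manipulate: Callable[[Resume], Resume],
    threshold: float,
    n_tickets: int = 2,
) -> bool:
    """Accept if any ticket scores at or above the threshold.

    Assumption (not taken from the source): the tickets are combined by
    taking the maximum score. With n_tickets=1 only the submitted resume
    is scored; with n_tickets=2 this is the two-ticket scheme.
    """
    if n_tickets < 1:
        raise ValueError("n_tickets must be at least 1")
    # Ticket 1 is the resume as submitted; the remaining tickets are
    # fresh hirer-side manipulations of that same submission.
    tickets = [resume]
    tickets += [manipulate(resume) for _ in range(n_tickets - 1)]
    return max(score(t) for t in tickets) >= threshold


# Toy stand-ins (hypothetical): a resume is reduced to a single quality
# number, and the "LLM" adds a random, non-negative improvement to it.
rng = random.Random(0)


def toy_manipulate(quality: float) -> float:
    return quality + rng.uniform(0.0, 0.5)


def toy_score(quality: float) -> float:
    return quality


# Two-ticket decision for one candidate under the toy model.
hired = n_ticket_decision(0.6, toy_score, toy_manipulate, threshold=0.8)
```

In this toy model, raising `n_tickets` gives every candidate more draws of the stochastic manipulation, so whether the candidate manipulated the resume before submitting matters less and less. This mirrors, informally, the convergence to a group-independent decision described above. The threshold here is arbitrary; in the paper's setting it would be chosen to satisfy the no false positives constraint.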
