Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring
Buse Sibel Korkmaz, Rahul Nair, Elizabeth M. Daly, Evangelos Anagnostopoulos, Christos Varytimidis, Antonio del Rio Chanona
TL;DR
This work introduces AutoRefine, an offline reinforcement-learning framework that fine-tunes foundation models for task-focused outputs without human feedback, targeting fairness in algorithmic hiring. AutoRefine couples a fine-tuned base model with a perturbation regulator and a task-specific evaluator, and optimizes outputs through a three-stage process (align, perturb, deploy) that uses ILQL (Implicit Language Q-Learning) to maximize diversity-oriented rewards. Applied to job-description rewriting, the approach reduces bias in candidate matching while preserving or improving recommendation quality, as demonstrated on Hacker News, Bias in Bios, and a live hiring platform dataset. The method offers a scalable pathway for governance-aligned language generation, with measurable improvements in fairness metrics and real-world hiring impact. Its limitations are computational cost, reliance on evaluation proxies, and potential hallucinations.
Abstract
Foundation models require fine-tuning to ensure their generative outputs align with intended results for specific tasks. Automating this fine-tuning process is challenging, as it typically requires human feedback that can be expensive to acquire. We present AutoRefine, a method that leverages reinforcement learning for targeted fine-tuning, using direct feedback from measurable performance improvements in specific downstream tasks. We demonstrate the method on a problem arising in algorithmic hiring platforms, where linguistic biases influence a recommendation system. In this setting, a generative model rewrites given job specifications so that a recommendation engine, which matches jobs to candidates, returns more diverse candidate matches. Our model detects and regulates biases in job descriptions to meet diversity and fairness criteria. Experiments on a public hiring dataset and a real-world hiring platform showcase how large language models can assist in identifying and mitigating biases in the real world.
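
To make the idea of direct feedback from a downstream task concrete, the following is a minimal sketch of one possible diversity-oriented reward. It is not the reward used in this work: the function names `group_balance` and `rewrite_reward`, the use of normalized Shannon entropy as the diversity measure, and the representation of matched candidates as a list of group labels are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def group_balance(groups: Sequence[str]) -> float:
    """Normalized Shannon entropy of the group labels (hypothetical measure).

    Returns 0.0 when at most one group is present and 1.0 when the
    observed groups are evenly represented.
    """
    counts = Counter(groups)
    if len(counts) < 2:
        return 0.0
    total = len(groups)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def rewrite_reward(groups_before: Sequence[str], groups_after: Sequence[str]) -> float:
    """Hypothetical reward: change in balance of the matched candidates.

    groups_before: group labels of candidates matched to the original job text.
    groups_after: group labels of candidates matched to the rewritten job text.
    A positive value means the rewrite led to a more balanced match list.
    """
    return group_balance(groups_after) - group_balance(groups_before)
```

Under these assumptions, a rewrite earns a positive reward only when the recommendation engine returns a more balanced candidate list for the rewritten text than for the original, so the signal comes from the downstream task itself rather than from human annotation.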
