AI Hiring with LLMs: A Context-Aware and Explainable Multi-Agent Framework for Resume Screening
Frank P. -W. Lo, Jianing Qiu, Zeyu Wang, Haibao Yu, Yeming Chen, Gao Zhang, Benny Lo
TL;DR
We address the challenge of high-volume, objective resume screening by proposing a context-aware, multi-agent framework that uses Retrieval-Augmented Generation (RAG) to dynamically tailor evaluations to each job role. The system comprises four agents (resume extractor, evaluator, summarizer, and score formatter) that parse resumes, score them against role-specific criteria, generate explainable feedback, and output standardized results. The evaluator retrieves external knowledge via RAG and, without fine-tuning the model, produces component scores $S^J=\{S_S^J,S_K^J,S_W^J,S_B^J,S_E^J\}$ that are combined into a final score $S_{final}^J=\sum_i w_i S_i^J$. Experimental results on 105 anonymized resumes show that multi-agent RAG-LLMs (e.g., with DeepSeek-V3) achieve higher $PC_{10}$ and $SC_{10}$ and lower $MAE$ than single-model baselines, reaching $PC_{10}=0.84$, $SC_{10}=0.74$, and $MAE=0.90$, in strong alignment with HR assessments. By decoupling extraction, evaluation, and feedback, the framework improves transparency and adaptability, and it can adapt to diverse hiring criteria across industries without retraining, demonstrating practical impact for scalable, fair, and explainable AI-driven hiring.
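The weighted aggregation $S_{final}^J=\sum_i w_i S_i^J$ can be illustrated with a minimal Python sketch. The score values, the 0 to 10 scale, the equal weights, and the function name `final_score` are all hypothetical placeholders chosen for illustration; the text above does not state the actual weights or scale, and the dictionary keys simply mirror the subscripts in $S^J$.

```python
from typing import Dict


def final_score(component_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum S_final = sum_i w_i * S_i over the component scores."""
    if set(component_scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    return sum(weights[k] * component_scores[k] for k in component_scores)


# Hypothetical component scores, keyed by the subscripts used in S^J.
# The 0-10 scale is an assumption, not taken from the source.
scores = {"S": 8.0, "K": 6.0, "W": 7.0, "B": 9.0, "E": 5.0}

# Hypothetical equal weights; the actual weights w_i are not given here.
weights = {key: 0.2 for key in scores}

result = final_score(scores, weights)
print(f"final score: {result:.2f}")
```

Changing only the `weights` dictionary is one way such a scheme could be re-tuned per role without touching the underlying model.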
Abstract
Resume screening is a critical yet time-intensive process in talent acquisition, requiring recruiters to analyze vast volumes of job applications while remaining objective, accurate, and fair. Recent advances in Large Language Models (LLMs), with their reasoning capabilities and extensive knowledge bases, open new opportunities to streamline and automate recruitment workflows. In this work, we propose a multi-agent framework for resume screening that uses LLMs to systematically process and evaluate resumes. The framework consists of four core agents: a resume extractor, an evaluator, a summarizer, and a score formatter. To enhance the contextual relevance of candidate assessments, we integrate Retrieval-Augmented Generation (RAG) within the resume evaluator, allowing it to incorporate external knowledge sources such as industry-specific expertise, professional certifications, university rankings, and company-specific hiring criteria. This dynamic adaptation enables personalized recruitment, bridging the gap between AI automation and talent acquisition. We assess the effectiveness of our approach by comparing AI-generated scores with ratings provided by HR professionals on a dataset of anonymized online resumes. The findings highlight the potential of multi-agent RAG-LLM systems in automating resume screening, enabling more efficient and scalable hiring workflows.
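The four-agent flow described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the `LLM` callable, the prompts, the word-overlap `retrieve` function, and the exact data flow between agents (for example, that the summarizer and formatter both read the evaluator's output) are all hypothetical, and `echo_llm` is a placeholder that calls no real model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class ScreeningResult:
    extracted: str   # output of the resume extractor
    evaluation: str  # output of the RAG-backed evaluator
    summary: str     # output of the summarizer
    formatted: str   # output of the score formatter


def retrieve(query: str, knowledge: List[str], top_k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank snippets by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        knowledge,
        key=lambda snippet: len(query_words & set(snippet.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def screen(resume: str, job: str, knowledge: List[str], llm: LLM) -> ScreeningResult:
    """Run the four agents in sequence; prompts are illustrative only."""
    extracted = llm(f"Extract structured fields from this resume:\n{resume}")
    context = "\n".join(retrieve(job, knowledge))
    evaluation = llm(f"Job: {job}\nContext:\n{context}\nEvaluate:\n{extracted}")
    summary = llm(f"Summarize this evaluation for a recruiter:\n{evaluation}")
    formatted = llm(f"Format the scores in a standard layout:\n{evaluation}")
    return ScreeningResult(extracted, evaluation, summary, formatted)


def echo_llm(prompt: str) -> str:
    """Placeholder model: returns the first line of the prompt."""
    return prompt.splitlines()[0]


knowledge_base = [
    "AWS certification matters for cloud engineer roles",
    "University rankings table",
    "Cooking recipes",
]
outcome = screen("Jane Doe, 5 years of AWS experience", "cloud engineer", knowledge_base, echo_llm)
print(outcome.evaluation)
```

Keeping each agent as a separate call makes every intermediate output inspectable, which is one way the decoupling described above could support explainability.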
