AI Hiring with LLMs: A Context-Aware and Explainable Multi-Agent Framework for Resume Screening

Frank P. -W. Lo, Jianing Qiu, Zeyu Wang, Haibao Yu, Yeming Chen, Gao Zhang, Benny Lo

TL;DR

We address the challenge of high-volume, objective resume screening by proposing a context-aware, multi-agent framework that uses Retrieval-Augmented Generation (RAG) to tailor evaluations to each job role. The system comprises four agents (resume extractor, evaluator, summarizer, and score formatter) that parse resumes, score them against role-specific criteria, generate explainable feedback, and output standardized results. The evaluator retrieves external knowledge via RAG and, without fine-tuning the model, produces a set of component scores $S^J=\{S_S^J,S_K^J,S_W^J,S_B^J,S_E^J\}$ for job $J$, which are combined into a final score $S_{final}^J=\sum_i w_i S_i^J$. Experimental results on 105 anonymized resumes show that multi-agent RAG-LLMs (e.g., with DeepSeek-V3) achieve higher $PC_{10}$ and $SC_{10}$ and lower $MAE$ than single-model baselines, with $PC_{10}=0.84$, $SC_{10}=0.74$, and $MAE=0.90$, and align strongly with HR assessments. By decoupling extraction, evaluation, and feedback, the framework improves transparency and can adapt to diverse hiring criteria across industries without retraining, which supports scalable, fair, and explainable AI-driven hiring.
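The aggregation $S_{final}^J=\sum_i w_i S_i^J$ can be illustrated with a minimal sketch. The component keys below simply mirror the subscripts S, K, W, B, E from the formula; the weight and score values are hypothetical and are not taken from the paper, and the function name `final_score` is an assumption made for this example.

```python
from typing import Dict

# Keys mirror the subscripts in S^J = {S_S, S_K, S_W, S_B, S_E}.
COMPONENTS = ("S", "K", "W", "B", "E")


def final_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return S_final = sum_i w_i * S_i over the five components."""
    missing = [c for c in COMPONENTS if c not in scores or c not in weights]
    if missing:
        raise KeyError(f"missing components: {missing}")
    return sum(weights[c] * scores[c] for c in COMPONENTS)


# Hypothetical values, for illustration only (not from the paper).
example_weights = {"S": 0.1, "K": 0.3, "W": 0.3, "B": 0.1, "E": 0.2}
example_scores = {"S": 6.0, "K": 8.0, "W": 7.0, "B": 9.0, "E": 5.0}

print(round(final_score(example_scores, example_weights), 2))
```

Whether the weights are normalized to sum to one is not stated in this summary, so the sketch does not enforce it.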

Abstract

Resume screening is a critical yet time-intensive process in talent acquisition, requiring recruiters to analyze vast volumes of job applications while remaining objective, accurate, and fair. Advances in Large Language Models (LLMs), with their reasoning capabilities and extensive knowledge bases, open new opportunities to streamline and automate recruitment workflows. In this work, we propose a multi-agent framework for resume screening that uses LLMs to systematically process and evaluate resumes. The framework consists of four core agents: a resume extractor, an evaluator, a summarizer, and a score formatter. To enhance the contextual relevance of candidate assessments, we integrate Retrieval-Augmented Generation (RAG) within the resume evaluator, allowing it to incorporate external knowledge sources such as industry-specific expertise, professional certifications, university rankings, and company-specific hiring criteria. This dynamic adaptation enables personalized recruitment, bridging the gap between AI automation and talent acquisition. We assess the effectiveness of our approach by comparing AI-generated scores with ratings provided by HR professionals on a dataset of anonymized online resumes. The findings highlight the potential of multi-agent RAG-LLM systems in automating resume screening, enabling more efficient and scalable hiring workflows.
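The comparison between AI-generated scores and HR ratings can be sketched as follows. This is a minimal illustration under the assumption that PC and SC denote Pearson and Spearman correlation and that MAE is the mean absolute error; the meaning of the subscript 10 is not given in this summary and is not modeled here. The score lists are hypothetical, not data from the paper.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def mae(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean absolute error between paired scores."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical HR and LLM scores, for illustration only.
hr_scores = [7.0, 5.0, 8.0, 6.0]
llm_scores = [6.5, 5.5, 9.0, 6.0]

print(pearson(hr_scores, llm_scores))
print(spearman(hr_scores, llm_scores))
print(mae(hr_scores, llm_scores))
```

Higher correlations and a lower MAE indicate closer agreement between the two sets of scores, which is the direction of improvement reported in the TL;DR.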

Paper Structure

This paper contains 26 sections, 10 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Illustration diagram of fine-tuned LLM and RAG-LLM for resume screening. (a) Traditional fine-tuning approaches (e.g., LoRA) require updating model parameters to adapt to new tasks (i.e., new companies' hiring requirements). (b) Our model uses RAG, eliminating the need for fine-tuning by dynamically retrieving relevant information from external sources.
  • Figure 2: The evolution of AI-driven hiring technologies. This figure presents the transition of AI-driven hiring methods across three major eras: traditional machine learning (2010-2016), deep learning (2016-2022), and large language models (2022-present). It highlights key advancements in AI hiring technologies and notable case studies demonstrating their real-world applications [LinkedInBrightAcquisitionForbesCheckrSAP2019GoogleHire2017samadhiya2022importance].
  • Figure 3: Illustration of the proposed multi-agent framework for resume screening. The framework consists of four core agents: Resume extractor, responsible for parsing and structuring resume content; Resume evaluator, which assigns scores based on predefined criteria while integrating external knowledge via RAG; Resume summarizer, which consists of three sub-agents that generate feedback through collective decision-making, ensuring a comprehensive evaluation of the candidate's strengths and weaknesses; Score formatter, which organizes evaluation results into a structured format for future analysis. This modular approach enhances explainability and adaptability, as recruiters can review each step of the evaluation process without needing to examine the raw resume directly.
  • Figure 4: Query formulation for resume evaluation agent. The query instructs the system to score extracted resume details by assessing skills, work experience, and education in relation to the applied job (J). It incorporates retrieved knowledge chunks (C) to ensure job-specific scoring criteria are considered.
  • Figure 5: Comparison of candidate scores assigned by human evaluators (HR) and a RAG-LLM (DeepSeek-V3). (a) Scatter plot showing the distribution of scores. (b) Histogram showing the number of candidates in each score range based on HR and LLM evaluations.
  • ...and 2 more figures
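
The query formulation described in the Figure 4 caption can be sketched roughly as below. This is not the authors' prompt: the function name `build_evaluation_query`, the field names, and the prompt wording are all hypothetical, chosen only to show how extracted resume details, the applied job (J), and retrieved knowledge chunks (C) might be combined into one query.

```python
from typing import Dict, List


def build_evaluation_query(
    resume_details: Dict[str, str], job: str, chunks: List[str]
) -> str:
    """Assemble a scoring query from resume details, job J, and chunks C."""
    resume_block = "\n".join(
        f"- {field}: {value}" for field, value in resume_details.items()
    )
    # Number the chunks so generated feedback can refer back to them.
    chunk_block = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(chunks, start=1)
    )
    return (
        f"Applied job (J): {job}\n\n"
        f"Extracted resume details:\n{resume_block}\n\n"
        f"Retrieved knowledge chunks (C):\n{chunk_block or '(none retrieved)'}\n\n"
        "Score the candidate's skills, work experience, and education "
        "in relation to the applied job, using the retrieved chunks as "
        "job-specific scoring criteria."
    )


# Hypothetical inputs, for illustration only.
example_query = build_evaluation_query(
    {
        "skills": "Python, SQL",
        "work experience": "3 years as a data analyst",
        "education": "BSc Statistics",
    },
    "Data Scientist",
    ["Hypothetical criterion: prefer candidates with production ML experience."],
)
print(example_query)
```

Keeping the retrieved chunks in the query, rather than in model weights, is what lets the scoring criteria change per job or company without retraining, as the Figure 1 caption describes.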