Gender Representation and Bias in Indian Civil Service Mock Interviews
Somonnoy Banerjee, Sujan Dutta, Soumyajit Datta, Ashiqur R. KhudaBukhsh
TL;DR
The study analyzes gender representation and bias in UPSC mock interviews using a dataset of 51,278 questions from 888 videos. It finds gender-biased questioning patterns, predominantly male interviewer panels, and societal biases reflected in the explanations LLMs give on a gender inference task. The work introduces a public dataset and a methodological pipeline for auditing bias in conversational content, with implications for fairness in high-stakes selection processes. The findings underscore the need for bias-aware reforms and systematic audits of AI explanations in educational and public-sector contexts.
Abstract
This paper makes three key contributions. First, using a substantial corpus of 51,278 interview questions sourced from 888 YouTube videos of mock interviews of Indian civil service candidates, we demonstrate stark gender bias in the broad nature of the questions posed to male and female candidates. Second, our experiments with large language models (LLMs) show a strong presence of gender bias in the explanations the LLMs provide on the gender inference task. Finally, we present a novel dataset of 51,278 interview questions that can inform future social science studies.
