RecruitView: A Multimodal Dataset for Predicting Personality and Interview Performance for Human Resources Applications
Amit Kumar Gupta, Farhan Sheth, Hammad Shaikh, Dheeraj Kumar, Angkul Puniya, Deepak Panwar, Sandeep Chaurasia, Priya Mathur
TL;DR
RecruitView is a multimodal dataset of 2,011 naturalistic video interview clips from more than 300 participants, with psychometrically grounded, continuous labels for 12 personality and interview-performance targets. The authors also introduce CRMF, a geometry-aware fusion framework that processes visual, audio, and text signals across hyperbolic, spherical, and Euclidean manifolds using adaptive routing and tangent-space fusion. In the reported experiments, CRMF outperforms the selected large multimodal baselines while using 40-50% fewer trainable parameters, which suggests that manifold-aware representations are valuable for behavioral prediction. The dataset is publicly available to support reproducible research on personality and interview-performance assessment in HR contexts.
Abstract
Automated personality and soft skill assessment from multimodal behavioral data remains challenging due to limited datasets and methods that fail to capture the geometric structure inherent in human traits. We introduce RecruitView, a dataset of 2,011 naturalistic video interview clips from 300+ participants with 27,000 pairwise comparative judgments across 12 dimensions: Big Five personality traits, overall personality score, and six interview performance metrics. To leverage this data, we propose Cross-Modal Regression with Manifold Fusion (CRMF), a geometric deep learning framework that explicitly models behavioral representations across hyperbolic, spherical, and Euclidean manifolds. CRMF employs geometry-specific expert networks to capture hierarchical trait structures, directional behavioral patterns, and continuous performance variations simultaneously. An adaptive routing mechanism dynamically weights expert contributions based on input characteristics. Through principled tangent space fusion, CRMF achieves superior performance while using 40-50% fewer trainable parameters than large multimodal models. Extensive experiments demonstrate that CRMF substantially outperforms the selected baselines, achieving up to 11.4% improvement in Spearman correlation and 6.0% in concordance index. Our RecruitView dataset is publicly available at https://huggingface.co/datasets/AI4A-lab/RecruitView
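To make the idea of routing over geometry-specific experts and fusing in tangent space more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes unit curvature, the origin of the Poincaré ball and the north pole of the sphere as base points, hypothetical linear experts, and a softmax gate; all function names (`poincare_exp0`, `sphere_log_pole`, `tangent_space_fusion`, and so on) are illustrative. A real model would apply manifold-specific layers between the exponential and logarithmic maps.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero for near-zero vectors


def poincare_exp0(v):
    """Exponential map at the origin of the Poincare ball (curvature -1)."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return np.tanh(norm) * v / np.maximum(norm, EPS)


def poincare_log0(y):
    """Logarithmic map at the origin of the Poincare ball."""
    norm = np.linalg.norm(y, axis=-1, keepdims=True)
    clipped = np.clip(norm, EPS, 1.0 - 1e-7)  # stay strictly inside the ball
    return np.arctanh(clipped) * y / np.maximum(norm, EPS)


def sphere_exp_pole(v):
    """Exponential map at the north pole of the unit sphere.

    v holds tangent coordinates in R^d; the result lies in R^(d+1).
    """
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    direction = v / np.maximum(norm, EPS)
    return np.concatenate([np.sin(norm) * direction, np.cos(norm)], axis=-1)


def sphere_log_pole(y):
    """Logarithmic map at the north pole; returns tangent coordinates in R^d."""
    head, last = y[..., :-1], y[..., -1:]
    theta = np.arccos(np.clip(last, -1.0, 1.0))  # geodesic distance to the pole
    head_norm = np.linalg.norm(head, axis=-1, keepdims=True)
    return theta * head / np.maximum(head_norm, EPS)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def expert_points(x, weights):
    """Hypothetical linear experts that place features on each manifold."""
    return {
        "hyperbolic": poincare_exp0(x @ weights["hyperbolic"]),
        "spherical": sphere_exp_pole(x @ weights["spherical"]),
        "euclidean": x @ weights["euclidean"],  # already a flat space
    }


def tangent_space_fusion(points, gate):
    """Map each expert output to its tangent space, then mix with gate weights."""
    tangents = np.stack(
        [
            poincare_log0(points["hyperbolic"]),
            sphere_log_pole(points["spherical"]),
            points["euclidean"],
        ],
        axis=1,
    )  # shape: (batch, 3, d)
    return (gate[..., None] * tangents).sum(axis=1)


# Toy usage with random features standing in for fused multimodal embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
names = ("hyperbolic", "spherical", "euclidean")
weights = {name: 0.1 * rng.normal(size=(8, 5)) for name in names}
gate = softmax(x @ (0.1 * rng.normal(size=(8, 3))))  # input-dependent routing
points = expert_points(x, weights)
fused = tangent_space_fusion(points, gate)
print(fused.shape)  # (4, 5)
```

Fusing in tangent space sidesteps the problem that points on different manifolds cannot be averaged directly: after the logarithmic maps, all three expert outputs live in ordinary vector spaces of the same dimension, so a gate-weighted sum is well defined.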
