Exploring the Role of Women in Hugging Face Organizations
Maria Tubella Salinas, Alexandra González, Silverio Martínez-Fernández
TL;DR
This study assesses gender diversity in Hugging Face (HF) organizations engaged in ML model development, using repository mining of HF data and LLM-informed gender inference. It analyzes organizational and individual contributions, model popularity, and commit activity to understand how gender dynamics relate to open-source ML ecosystems. The findings reveal a pronounced underrepresentation of women in HF organizations and among commit authors, and no consistent link between gender diversity and model downloads; bots and male contributors account for most commits. The work highlights persistent inequities in open-source ML communities and suggests the need for broader inclusion efforts and for more nuanced metrics that capture the impact of contributions beyond commit counts.
Abstract
Background: Despite its impact on innovation, gender diversity remains far from being fully achieved in open-source projects. Aims: We examine gender diversity in Hugging Face (HF) organizations, investigating its impact on innovation and team dynamics in open-source development projects. Method: We conducted a repository mining study, focusing on ML model development projects on HF, to explore the involvement of women in collaborative processes. Results: Women are highly underrepresented both in organizations and in the distribution of commits, a pattern that also holds when analyzing individual developers. Conclusions: Addressing gender disparities is essential to create more equitable, diverse, and inclusive open-source ecosystems.
