Exploring the Role of Women in Hugging Face Organizations

Maria Tubella Salinas, Alexandra González, Silverio Martínez-Fernández

TL;DR

This study assesses gender diversity within Hugging Face organizations in ML model development using repository mining of HF data and LLM-informed gender inferences. It analyzes organizational and individual contributions, model popularity, and commit activity to understand how gender dynamics relate to open-source ML ecosystems. The findings reveal a pronounced underrepresentation of women in HF organizations and among commit authors, with no consistent link between gender diversity and model downloads, while bots and male contributors dominate commits. The work highlights persistent inequities in open-source ML communities and suggests the need for broader inclusion efforts and more nuanced metrics to capture the impact of contributions beyond mere commit counts.

Abstract

Background: Despite its impact on innovation, gender diversity is still far from achieved in open-source projects. Aims: We examine gender diversity in Hugging Face (HF) organizations, investigating its impact on innovation and team dynamics in open-source development projects. Method: We conducted a repository mining study, focusing on ML model development projects on HF, to explore the involvement of women in collaborative processes. Results: Women are highly underrepresented in both organizations and the distribution of commits, a pattern that also holds when analyzing individual developers. Conclusions: Addressing gender disparities is essential to create more equitable, diverse, and inclusive open-source ecosystems.

Paper Structure

This paper contains 20 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Repository Mining Study: Data Collection, Augmentation, and Analysis.
  • Figure 2: Normalized Gender Distribution of the Top 15 Organizations.
  • Figure 3: Top 20 Authors by Number of Models They Committed to.
  • Figure 4: Number of Commits by Gender for the Top 15 Most Downloaded Models.