What do we know about Hugging Face? A systematic literature review and quantitative validation of qualitative claims

Jason Jones, Wenxin Jiang, Nicholas Synovic, George K. Thiruvathukal, James C. Davis

TL;DR

The paper addresses the lack of a cohesive, quantitative understanding of Pre-Trained Model (PTM) reuse on Hugging Face through a two-stage study: a systematic literature review that extracts qualitative claims about PTM reuse, followed by large-scale quantitative analyses that test these claims and compare PTM reuse dynamics with those of traditional software registries. It finds that the Transformers library dominates descendant reuse on Hugging Face, that turnover among top PTMs is high compared with traditional registries, and that better documentation correlates with greater popularity and reuse. The work operationalizes qualitative claims into measurable metrics and validates several of them against public PTM and traditional software package registry (SPR) datasets (e.g., PeaTMOSS, HF Model Metadata, PTMTorrent, Ecosyste.ms), contributing to a more rigorous understanding of the PTM supply chain. These findings inform platform design and metrics for PTM reuse, suggesting focused improvements in versioning, model lineage tracking, and documentation practices to support practitioners and researchers.

Abstract

Background: Collaborative Software Package Registries (SPRs) are an integral part of the software supply chain. Much engineering work synthesizes SPR packages into applications. Prior research has examined SPRs for traditional software, such as NPM (JavaScript) and PyPI (Python). Pre-Trained Model (PTM) registries are an emerging class of SPR of increasing importance, because they support the deep learning supply chain. Aims: Recent empirical research has examined PTM registries from perspectives such as vulnerabilities, reuse processes, and evolution. However, no existing research synthesizes these studies to provide a systematic understanding of the current knowledge. Some of the existing research includes qualitative claims lacking quantitative analysis. Our research fills these gaps by providing a knowledge synthesis and quantitative analyses. Methods: We first conduct a systematic literature review (SLR). We then observe that some of the claims are qualitative. We identify quantifiable metrics associated with those claims, and measure them in order to substantiate the claims. Results: From our SLR, we identify 12 claims about PTM reuse on the Hugging Face platform, 4 of which lack quantitative validation. We successfully test 3 of these claims through a quantitative analysis, and directly compare one with traditional software. Our findings corroborate qualitative claims with quantitative measurements. Our findings are: (1) PTMs have a much higher turnover rate than traditional software, indicating a dynamic and rapidly evolving reuse environment within the PTM ecosystem; and (2) there is a strong correlation between documentation quality and PTM popularity. Conclusions: We confirm qualitative research claims with concrete metrics, supporting prior qualitative and case study research. Our measures show further dynamics of PTM reuse, inspiring research infrastructure and new measures.

Paper Structure (35 sections, 1 equation, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Overview of this work's context and approach. Pre-trained deep neural network models (PTMs) are beginning to be reused, and accompanying empirical research is emerging. This work provides the first systematic literature review on PTM reuse. We extract the claims in prior work (RQ1) and provide quantitative evaluation of un-quantified and under-quantified claims (RQ2).
  • Figure 2: The software supply chain. Software package registries connect package authors to reusers, accelerating system development.
  • Figure 3: Reuse processes of traditional software and of PTMs, as reported by Jiang et al. [jiang_empirical_2023]. Reuse processes are similar, suggesting that SPR measurements and trends may be similar.
  • Figure 4: Systematic literature review process and results.
  • Figure 5: Usage proportions of the top-10 libraries among Hugging Face PTM packages that support at least two libraries. Among these packages, most support the transformers library, followed by the Hugging Face-promoted SafeTensors library. Most other libraries see little usage in comparison. In contrast to previous work [castano_analyzing_2023], PyTorch is not among the most popular libraries when a PTM package supports more than one library.
  • ...and 3 more figures