Responsible AI in Open Ecosystems: Reconciling Innovation with Risk Assessment and Disclosure

Mahasweta Chakraborti, Bert Joseph Prestoza, Nicholas Vincent, Seth Frey

TL;DR

This paper investigates responsible AI within open, informal ecosystems by analyzing Hugging Face data to link risk documentation with evaluation and accuracy. It analyzes 7903 HF projects and 789 leaderboard submissions to quantify how disclosure of risks and limitations co-occurs with performance reporting. The results show a strong positive association between evaluation and risk documentation, yet high-performing leaderboard entries are less likely to document risks, suggesting a tension between performance emphasis and responsible disclosure. The authors propose governance-oriented interventions—such as multi-metric benchmarks and streamlined risk-reporting guidelines—to preserve open-source innovation while improving ethical uptake and accountability.

Abstract

The rapid scaling of AI has spurred a growing emphasis on ethical considerations in both development and practice. This has led to the formulation of increasingly sophisticated model auditing and reporting requirements, as well as governance frameworks to mitigate potential risks to individuals and society. At this critical juncture, we review the practical challenges of promoting responsible AI and transparency in informal sectors like OSS that support vital infrastructure and see widespread use. We focus on how model performance evaluation may inform or inhibit probing of model limitations, biases, and other risks. Our controlled analysis of 7903 Hugging Face projects found that risk documentation is strongly associated with evaluation practices. Yet, submissions (N=789) from the platform's most popular competitive leaderboard showed less accountability among high performers. Our findings can inform AI providers and legal scholars in designing interventions and policies that preserve open-source innovation while incentivizing ethical uptake.

Paper Structure (20 sections, 4 figures, 4 tables)

Figures (4)

  • Figure 1: (a) Growth in the number of projects on HF Hub after the release of its client library in December 2020. (b) Development trends, shown as the number of projects by modality among service-ready models uploaded since 2021. Natural Language Processing is consistently the most sought-after AI/ML application, closely followed by Reinforcement Learning, Computer Vision, and Audio. The trends over time (mean with 95% confidence interval) in (c) model sizes and (d) training data requirements, among 140,007 and 17,251 projects uploaded since 2021 respectively, show a discernible increase in development scale.
  • Figure 2: Contribution trends among all 456,545 service-ready projects. Most contributions are individual rather than collaborative projects, with 87.57% having 0 pull requests and 86.34% uploaded by single-member accounts. Around 5.46% of models have been deployed in applications (Spaces).
  • Figure 3: Documentation rates of Model Card components. Among 456,545 usable models, evaluations were the most frequently documented component (15.9%), while risks and limitations were documented in 2.2%. $\mathrm{CO}_2$ emissions saw the least reporting, at 0.7%. About 0.7% of models contained both evaluations and limitations, and only around 0.1% completed all three sections.
  • Figure 4: (a) Company contributions show the highest growth rate between 2022 and 2024, surpassing universities and non-profits. (b) Documentation differs noticeably across developer/provider types. Non-profits lead among organizations in providing model evaluations. On the whole, non-profits, companies, and universities document risks more than the population average.
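
As a rough illustration of the co-occurrence described in the Figure 3 caption, the sketch below compares the reported share of models documenting both evaluations and risks with the share that would be expected if the two sections were filled in independently. It uses only the rounded percentages from the caption, so the result is approximate. The function names (`lift`, `conditional`) are hypothetical helpers introduced here, and this back-of-the-envelope check is not the paper's controlled analysis.

```python
# Rounded rates from the Figure 3 caption (shares of 456,545 usable models).
# These are caption-level figures, not the paper's controlled estimates.
P_EVAL = 0.159  # models documenting evaluations
P_RISK = 0.022  # models documenting risks and limitations
P_BOTH = 0.007  # models documenting both sections


def lift(p_a: float, p_b: float, p_ab: float) -> float:
    """Observed joint rate divided by the joint rate expected under independence."""
    return p_ab / (p_a * p_b)


def conditional(p_ab: float, p_given: float) -> float:
    """P(A | B) computed from the joint rate P(A and B) and the marginal P(B)."""
    return p_ab / p_given


# Share of models with both sections if the two were unrelated.
expected_if_independent = P_EVAL * P_RISK

# How much more often the two sections co-occur than independence predicts.
co_occurrence_lift = lift(P_EVAL, P_RISK, P_BOTH)

# Risk documentation rate among models that report evaluations ...
risk_given_eval = conditional(P_BOTH, P_EVAL)

# ... versus among models that do not report evaluations.
risk_given_no_eval = (P_RISK - P_BOTH) / (1.0 - P_EVAL)

if __name__ == "__main__":
    print(f"expected joint rate under independence: {expected_if_independent:.4%}")
    print(f"observed joint rate:                    {P_BOTH:.4%}")
    print(f"lift:                                   {co_occurrence_lift:.2f}")
    print(f"P(risk | eval):                         {risk_given_eval:.2%}")
    print(f"P(risk | no eval):                      {risk_given_no_eval:.2%}")
```

With these rounded inputs, the two sections co-occur roughly twice as often as independence would predict, which is consistent in direction with the positive association between evaluation and risk documentation reported in the abstract. Because the inputs are rounded to one decimal place, the ratio should be read as an order-of-magnitude check only.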