Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem
Shayne Longpre, Christopher Akiki, Campbell Lund, Atharva Kulkarni, Emily Chen, Irene Solaiman, Avijit Ghosh, Yacine Jernite, Lucie-Aimée Kaffee
TL;DR
The paper uses a comprehensive, longitudinal dataset of Hugging Face Model Hub downloads (2020–2025) to map how power concentrates and diffuses across models, developers, and nations in the open AI ecosystem. Using rolling-window usage signals, annotated metadata, and economic concentration metrics (HHI, Gini), it reveals a dramatic shift from US industry dominance toward unaffiliated developers, community organizations, and Chinese developers, alongside the rise of intermediaries that re-package existing models. It documents a parallel technical transformation toward larger, multimodal, and quantized models, with increased reliance on mixture-of-experts architectures and substantial declines in data transparency, as open-weight models overtake fully open-source ones. The work contributes a publicly released dataset and dashboard to support ongoing monitoring, governance, and informed policy discussion about openness, participation, and competition in open AI ecosystems.
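The two concentration metrics named above, the Herfindahl-Hirschman Index (HHI) and the Gini coefficient, can be illustrated with a minimal Python sketch. This uses the standard textbook definitions applied to a list of per-developer download counts; the function names, the 0-to-1 scaling of HHI, and the toy inputs are assumptions for illustration and are not taken from the paper's own code or data.

```python
from typing import Sequence


def hhi(downloads: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared market shares.

    Shares are fractions, so the result lies in (0, 1]; 1.0 means a
    single entity holds all downloads. (Assumed scaling; some sources
    use percentage shares, giving a 0-10,000 range.)
    """
    total = sum(downloads)
    if total <= 0:
        raise ValueError("downloads must sum to a positive value")
    return sum((d / total) ** 2 for d in downloads)


def gini(downloads: Sequence[float]) -> float:
    """Gini coefficient of a non-negative distribution.

    Uses the sorted-rank formula
        G = 2 * sum_i(i * x_i) / (n * sum_i(x_i)) - (n + 1) / n
    with x sorted ascending and i starting at 1. 0.0 means perfectly
    equal; the maximum for n entities is (n - 1) / n.
    """
    xs = sorted(downloads)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("downloads must be non-empty with a positive sum")
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical download counts for four developers (not real data).
    evenly_spread = [100, 100, 100, 100]
    concentrated = [0, 0, 0, 400]
    print(hhi(evenly_spread), gini(evenly_spread))
    print(hhi(concentrated), gini(concentrated))
```

Both metrics rise as downloads concentrate in fewer hands, but they answer slightly different questions: HHI is dominated by the largest shares, while Gini reflects inequality across the whole distribution, including the long tail of rarely downloaded models.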
Abstract
Since 2019, the Hugging Face Model Hub has been the primary global platform for sharing open-weight AI models. By releasing a dataset of the complete history of weekly model downloads (June 2020 to August 2025) alongside model metadata, we provide the most rigorous examination to date of concentration dynamics and evolving characteristics in the open model economy. Our analysis spans 851,000 models, over 200 aggregated attributes per model, and 2.2B downloads. We document a fundamental rebalancing of economic power: the dominance of US open-weight industry players Google, Meta, and OpenAI has declined sharply in favor of unaffiliated developers, community organizations, and, as of 2025, Chinese industry, with DeepSeek and Qwen models potentially heralding a new consolidation of market power. We identify statistically significant shifts in model properties: a 17X increase in average model size and rapid growth in multimodal generation (3.4X), quantization (5X), and mixture-of-experts architectures (7X). These come alongside concerning declines in data transparency, with open-weight models surpassing truly open-source models for the first time in 2025. We also identify an emerging layer of developer intermediaries focused on quantizing and adapting base models for both efficiency and artistic expression. To enable continued research and oversight, we release the complete dataset with an interactive dashboard for real-time monitoring of concentration dynamics and evolving properties in the open model economy.
