The Hidden AI Race: Tracking Environmental Costs of Innovation
Shyam Agarwal, Mahasweta Chakraborti
TL;DR
This study investigates the environmental footprint of AI development by linking carbon emissions to model size, repository activity, domain, and organizational context, using a generalized linear model (GLM) fitted to 782 HuggingFace model-card entries. It finds strong associations between emissions and factors such as parameter count, commits, and repository age. Domain and affiliation effects show higher emissions for audio models and university-driven work, while community-driven projects tend to be more efficient. The work highlights domain-specific optimization opportunities and advocates green AI practices, including energy-efficient architectures and sustainable development workflows, while acknowledging the limitations of observational data and inconsistent reporting. Overall, the paper provides a data-driven framework for identifying actionable levers to reduce AI's environmental impact and sets directions for standardized measurement and cross-sector collaboration in sustainable AI research.
Abstract
The past decade has seen a sharp rise in the popularity of AI systems, driven mainly by advances in generative AI, which have transformed numerous industries and applications. However, this progress comes at a considerable environmental cost: training and deploying these models consumes significant computational resources and energy and leaves a large carbon footprint. In this paper, we study the carbon dioxide emitted by models across different domains and over varying time periods. By examining factors such as model size, repository activity (e.g., commits and repository age), task type, and organizational affiliation, we identify key factors influencing the environmental impact of AI development. Our findings show that model size and versioning frequency are strongly correlated with higher emissions, while domain-specific trends show that NLP models tend to have lower carbon footprints than audio-based systems. Organizational context also plays a significant role: university-driven projects exhibit the highest emissions, followed by non-profits and companies, while community-driven projects show lower emissions. These results highlight the need for green AI practices, including adopting energy-efficient architectures, optimizing development workflows, and leveraging renewable energy sources. We discuss practices that can lead to a more sustainable future with AI and close with future research directions motivated by our work. The work provides actionable insights for mitigating the environmental impact of AI and poses new research questions for the community to explore. By emphasizing the interplay between sustainability and innovation, our study aims to guide future efforts toward building a more ecologically responsible AI ecosystem.
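To make the kind of analysis summarized above concrete, the following is a minimal sketch of a log-linear regression, which can be read as a Gaussian GLM with an identity link on log-transformed emissions. It is an illustration only, not the paper's actual model. The data are synthetic, the coefficient values in TRUE_BETA are invented for the simulation, and the predictor names (log_params, log_commits, is_audio) and the choice of family and link are assumptions.

```python
import math
import random

# Made-up coefficients used only to simulate data (NOT results from the paper):
# intercept, log(parameter count), log(1 + commits), audio-domain dummy.
TRUE_BETA = [0.5, 0.8, 0.3, 0.6]


def simulate(n, seed=0):
    """Generate synthetic rows [1, log_params, log_commits, is_audio] and log-emissions."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        log_params = rng.uniform(6.0, 10.0)             # hypothetical model-size scale
        log_commits = math.log1p(rng.randint(1, 200))   # hypothetical repository activity
        is_audio = 1.0 if rng.random() < 0.3 else 0.0   # hypothetical domain indicator
        row = [1.0, log_params, log_commits, is_audio]
        mean = sum(b * x for b, x in zip(TRUE_BETA, row))
        X.append(row)
        y.append(mean + rng.gauss(0.0, 0.1))            # Gaussian noise on the log scale
    return X, y


def fit_least_squares(X, y):
    """Solve the normal equations (X^T X) beta = X^T y by Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (b[i] - tail) / A[i][i]
    return beta


X, y = simulate(200)
beta = fit_least_squares(X, y)
for name, value in zip(["intercept", "log_params", "log_commits", "is_audio"], beta):
    print(f"{name}: {value:.3f}")
```

Because the response is on a log scale, a coefficient b on an indicator such as is_audio corresponds to a multiplicative factor of exp(b) on emissions, holding the other predictors fixed. With observational data of this kind, such coefficients describe associations rather than causal effects.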
