Data and System Perspectives of Sustainable Artificial Intelligence
Tao Xie, David Harel, Dezhi Ran, Zhenwen Li, Maoliang Li, Zhi Yang, Leye Wang, Xiang Chen, Ying Zhang, Wentao Zhang, Meng Li, Chen Zhang, Linyi Li, Assaf Marron
TL;DR
The paper addresses the significant environmental and resource challenges facing AI by examining the data and system perspectives of Sustainable AI across data acquisition, data processing, and AI model training and inference. It combines a survey of current issues with concrete opportunities and example solutions, including data synthesis, IDS-based privacy-preserving data sharing, automated data cleaning and feature engineering, and energy-efficient hardware and DSAs. A key contribution is outlining practical paths, such as federated learning, RISC-V-based accelerators, and hardware-software co-optimization, that can reduce energy use while maintaining performance for large-scale models, exemplified by discussions of GPT-3’s training energy and parameter scale. The work emphasizes interdisciplinary collaboration among data governance, machine learning, and hardware design to enable scalable, trustworthy, and environmentally sustainable AI deployment, with practical guidance for researchers and industry on implementing these approaches.
Abstract
Sustainable AI is a subfield of AI concerned with developing and using AI systems in ways that aim to reduce environmental impact and achieve sustainability. Sustainable AI is increasingly important given that training of and inference with AI models such as large language models consume a large amount of computing power. In this article, we discuss current issues, opportunities and example solutions for addressing these issues, and future challenges to tackle, from the data and system perspectives, related to data acquisition, data processing, and AI model training and inference.
